Reverse Code Engineering RCE CD +sandman 2000


Text File | 2000-05-25 | 179.3 KB | 4,668 lines

::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:|   _|\  \::::::::::.                                    Dec 98/Jan 99
:::\_____\::::::::::.                                           Issue 2
::::::::::::::::::::::.........................................................

            A S S E M B L Y   P R O G R A M M I N G   J O U R N A L
                      http://asmjournal.freeservers.com
                           asmjournal@mailcity.com

T A B L E   O F   C O N T E N T S
----------------------------------------------------------------------
Introduction...................................................mammon_

"Keygen Coding Competition".................................Ghiribizzo
"How to Use A86 for Beginners".................................Linuxjr
"Using the Gnu AS Assembler"...................................mammon_
"A Guide to NASM for TASM Coders"..................................Gij
"Tips on saving bytes in ASM programs"...................Larry Hammick

Column: Win32 Assembly Programming
    "A Simple Window".........................................Iczelion
    "Painting with Text"......................................Iczelion

Column: The C Standard Library in Assembly
    "The _Xprintf functions"....................................Xbios2

Column: The Unix World
    "X-Windows in Assembly Language: Part I"...................mammon_

Column: Assembly Language Snippets
    "IsASCII?"............................................Troy Benoist
    "ENUM, CallTable"..........................................mammon_

Column: Issue Solution
    "PE Solution"...............................................Xbios2
----------------------------------------------------------------------
 +++++++++++++++++++++++Issue Challenge++++++++++++++++++++
 Write the smallest possible PE program that outputs its command line
----------------------------------------------------------------------


::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:|   _|\  \::::::::::.
:::\_____\:::::::::::..............................................INTRODUCTION
                                                                     by mammon_

Wow! This issue is huge.
More than twice the size of the last; maybe it is time to go monthly...

This issue has as its theme --such as it is-- the use of popular free- and
shareware assemblers. It began with my needing to write a GAS intro to
accompany my X-Windows article; shortly thereafter, Linuxjr emailed me the
benefits of his university training with his A86 tutorial (beginners: this is
for you! Linuxjr explains *everything*). I then appealed to Gij to allow me to
incorporate his Nasm 'Quick-Start' guide, which I have used often...he set the
condition that I edit it heavily ;)

I would like to draw your attention first to our new column: Assembly Language
Snippets. Originally this was an idea which I and a few others had; however, I
never received any contributions for the 'Snippets' section. Then I received
an email from Troy with the first one... I pulled the rest out of my various
asm sources and voila, a new column was born. This is something that is fully
open to contributions; asm snippets --and we will need lots-- may be emailed
to asmjournal@mailcity.com or mammon_@hotmail.com, or they may be posted to
the Message Board at http://pluto.beseen.com/boardroom/q/19784/
Basic format should be:

    ;Name:           Name to title you with
    ;Routine Title:  Name to title the snippet with
    ;Summary:        One-Line Description
    ;Compatibility:  Specific Assemblers or OSes this works with
    ;Notes:          Any extra notes you have
    --Code--

I should point out here that freeservers.com is not very reliable; thus the
APJ home page is inaccessible more often than not. For this reason I have set
up a mirror on my own page, at
    http://www.eccentrica.org/Mammon/APJ/index.html

As for this issue's articles, we once again have two fine Win32 asm tutorials
by Iczelion, who maintains an excellent page at http://iczelion.cjb.net (with
a Win32 asm message board!). Ghiribizzo has supplied his fun Key Generator
Competition results (I can't say I was surprised when I saw the winner ;).
Larry Hammick --who also maintains an excellent, smoking-enabled page at
http://www3.bc.sympatico.ca/hammick/-- has contributed a fantastic piece on
asm optimization. XBios2 has this time gone above and beyond, not only with
the C Language in Assembly but with his Issue Challenge as well... asm coders
and reverse engineers alike should read this.

As for the issue challenge, XBios2 did not provide me with one for next issue,
so I used one from a text I found on the Internet somewhere... he has been
emailed the text and can try to beat it ;) Also, I am going to be setting up a
page for reader responses to the Issue Challenges -- readers can anticipate
the solutions before each issue comes out, or try to best the solution
afterwards. Submissions can be sent to the same places as the Snippets.

Author bios? I know mainstream mags do this -- if you want one, send one. I'll
tack it onto the end of the article ... anything within reason: URL, email,
hobbies, perversions, favorite drink, favorite linux distro, etc.

Next Issue: How many articles on Code Optimization can I get? That would make
a great theme (with the foundation laid this issue) -- anything from code
theory to PentiumII-specific optimizations would be welcome. Prospective
articles, send to me or post on the MB...no topic is unacceptable unless you
can in no way possible relate it to assembly language.

Enjoy the ish,
_m


::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:|   _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                      Keygen Coding Competition
                                                                  by Ghiribizzo

Introduction
------------
The competition was to write the smallest key generator for the simple serial
scheme I wrote as a trainer for newbies.
I had a few reasons for starting this competition:

  - To give the newbies a chance to participate in a competition
  - To give old hands the chance to brush up on their assembly skills
  - To promote tight assembly coding
  - To demonstrate the various different methods used to improve efficiency
    in coding

Well, I'm back from my short European jaunt and the competition is now closed.
I have greatly enjoyed the entire competition, from the coding of the crackme
and the chats with various crackers on IRC through to deciding the winner and
writing this document.

Analysis of the Serial Scheme
-----------------------------
The serial scheme was kept deliberately simple as it was written for newbies
to train with. The scheme took a name up to 16 bytes long and required a
16 byte serial number. There was a 256 byte lookup table that was indexed
directly with the ASCII values of the name field. The name was padded to a
length of 16 (if necessary) using values hardcoded into the scheme.

The 256 byte lookup table was created using eight maximal 8 bit linear
feedback shift registers (LFSRs) in parallel, i.e. producing one output byte
per 'clock'. The LFSRs were initialised to produce 'Ghir_OCU' as the first 8
output bytes. The table was precomputed and it was not expected that the
cracker recognise the nature of the lookup table - although a post I made to
the cracking forum about LFSRs might have tipped off the more astute crackers!

The rules of the competition required that some standard interface text be
included, which strongly encouraged the use of service 9 of interrupt 21h -
though this would probably be used in any case - and discouraged blank screens
and other unfriendly UIs from being used to save bytes. Also, the rules
specified a range of input to be handled that was smaller than the possible
maximum of 256 values. Due to the simple nature of the serial scheme, this
meant that the lookup table could immediately be stripped down to the input
range.

I envisioned that there would be 3 'fights'.
One to reduce the original algorithm, a second to reduce the packed table
lookup algorithm and the last to reduce the LFSR algorithm. As it turned out,
everyone seemed to go for the packed table option.

The Entrants
------------
The following entrants have been included because they illustrate the
different ideas and methods used to reach the common goal of reducing code
size. I didn't realise that so many crackers would use the precomputed table
method. Perhaps word got out during IRC chats and everybody started using
them? In any case, this didn't reduce the size cutting war as precomputation
had its own routines that needed to be optimised.

Ghiribizzo Alpha (223 bytes)
----------------------------
This was not an entrant as it would hardly be fair for me to enter the
competition knowing how the lookup table was generated! This keygen was
basically converted from the crackme and improved 'on-the-fly' by generating
the lookup table in code and tidying up routines where they were obviously
inefficient. No great thought went into this and the code size was just to
give myself an idea of what crackers would be aiming for.

Aside from generating the lookup table, the only other unusual feature of this
keygen was the use of the XLAT command instead of the standard indexing used
in the crackme. I didn't stop to check whether this used less space or not,
but included it as newbies may not be familiar with the XLAT instruction. As
it happened, the XLAT instruction was used in Spyder's keygen.

From the size I got from this keygen, I tried to guess a required key input
range to put this size between the straight table precomputation and the
packed table precomputation.

One thing to note is how I ended the program. I was quite surprised by the
fact that nobody else seemed to know that you could quit com programs with a
ret instruction.
Further size savings can be made by using Bb's trick of keeping DH constant
and also by tweaking the generator to fix some of the bitstreams produced to
give us the bits we need and save later processing.

Cruehead Alpha (244 bytes)
--------------------------
I got this from Cruehead on IRC when I asked to see what he had managed so
far. Although this version is unfinished it is still impressive. The keygen
relies on precomputing the whole table and reducing the keygen to a single
table lookup. The coding is very simple - it almost seems as if Cruehead was
typing the steps going through his head straight onto the keyboard (perhaps he
was?). The resulting code is consequently very easy to understand and follow.

Bb #10 (230 bytes)
------------------
Bb has written an excellent keygen. He has put some serious hard work into
this, including taking the time to calculate the dx offsets manually instead
of just using the 'offset' feature that the assembler provides. It has been
fun watching Bb's keygen progress, as the first one I received was version 5
which was 256 bytes long. The keygen presented here is version 10. There are
other nice bits and bobs throughout this code. This makes it quite frustrating
as in various places so much space is blatantly wasted. Just take a look at
the last 6 lines of code! There shouldn't even be 6 lines there! I'm sure Bb
will learn a lot from seeing some of the other keygens here and I'm sure he
will do very well should he enter the next competition.

Spyder (211 bytes)
------------------
Tidy, compact and elegantly coded. A little sparse in commenting (it seems
like Spyder coerced IDA to write the keygen for him ;-p). The table lookup is
an interesting piece of code.

VoidLord (247 bytes)
--------------------
Another keygen using the idea of a packed precomputed table. VoidLord's first
keygen. Let's hope we see more!
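
The packed-table idea shared by several entrants can be sketched in a few
lines. This is only an illustration, not any entrant's actual routine:
NibbleLookup and PackedTable are made-up names, the syntax is TASM/MASM-style
16-bit code, and the layout assumed here (even index in the high nibble, odd
index in the low nibble) is the one the Bb and Spyder tables appear to use.

```asm
; Sketch only. In:  AL = name character, 20h..7Fh
;              Out: AL = serial character, '0'..'9' or 'A'..'F'
; Clobbers BX and CL. NibbleLookup/PackedTable are hypothetical names.
NibbleLookup:
        sub     al, 20h                 ; table index 0..5Fh
        xor     bx, bx
        mov     bl, al
        shr     bx, 1                   ; BX = byte index, CF = 1 if index odd
        mov     al, [PackedTable+bx]    ; MOV leaves the flags alone
        jc      nl_low                  ; odd index: entry is the low nibble
        mov     cl, 4
        shr     al, cl                  ; even index: bring high nibble down
nl_low:
        and     al, 0Fh                 ; keep one 4-bit entry
        add     al, '0'                 ; 0..9 -> '0'..'9'
        cmp     al, '9'
        jbe     nl_done
        add     al, 7                   ; 10..15 -> 'A'..'F'
nl_done:
        ret

; Only the first two bytes of the 48-byte table from the listings are
; shown; the remaining bytes are omitted in this sketch.
PackedTable db 75h, 41h
```

With those two bytes, the characters ' ', '!', '"' and '#' map to '7', '5',
'4' and '1', which matches the first four entries of the unpacked table in
Cruehead's listing. Using shift-by-CL keeps the sketch assemblable for a
plain 8086 target.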

Honourable Mentions
-------------------
Special mention given to Trykka who managed to deduce how the look-up table
was created - but never sent in an entry!

The Winner
----------
Well it looks like Spyder is the winner by quite a large margin. Incidentally,
I have just made a quick check that the keygens work. You might be able to
bump yourself up on the scale by picking holes in the other keygens :-)

Rankings
--------
__Keygen______Size________Author______
  kgen.com     211        Spyder
  kg.com       224        Ghiribizzo (alpha)
  kg10.com     230        Bb
  kg9.com      233        Bb
  kg6.com      239        Bb
  kgvoid.com   247        VoidLord
  kgcrue.com   255        Cruehead (alpha)
  kg5.com      256        Bb
  kgt.com      529        Serial Scheme

Final Words
-----------
There have been some excellent ideas in the keygens. However, none of the
keygens are as small as they could be. They all have some scope for
improvement. By combining some of the ideas given in the above keygens, we
could create a new smaller keygen. It will be interesting to see what the
smallest possible keygen would look like.

I hope that everyone who has taken part in the competition, or who has
followed it, has gained something from it. I hope that there will be more
entries for the next competition!

The Source Codes
----------------

; Ghiribizzo's Keygen =========================================================
.model tiny
.386
.code
.startup

; The first part of the code is the table generator
; Note that we can actually do some 'precomputing' by
; fixing some of the bits in the generator to produce
; the bits that we need. This will save some bytes
; in the serial section. I have not bothered to do this.
        mov ax, 5547h
        mov bx, 6869h
        mov cx, 725fh
        mov dx, 4f43h
        mov di, offset PRD
        mov si, offset PRD + 0ffh
LFSR:
        stosb
        ;Save MSB
        mov bp,ax
        mov al,ah
        and ax,0ffh
        xchg ax,bp
        ;Tap
        xor ah,bl
        xor ah,ch
        xor ah,al
        ;Shift
        mov al,bh
        mov bh,bl
        mov bl,ch
        mov ch,cl
        mov cl,dh
        mov dh,dl
        ;Store MSB
        and dx,0ff00h
        or dx,bp
        cmp di,si
        jle LFSR
;-----------------------------------------------------------------
        mov ah,9
        mov dx,offset startMsg
        int 21h
        mov ah,10
        mov dx,offset NameInput
        int 21h
;-----------------------------------------------------------------
        mov si,offset NameBuffer
        mov di,offset NameHash
        mov bx,offset Table1
MakeSerial:
        lodsb
        xlat
        and al,3fh
        or al,30h
byteOK:
        cmp al,39h
        jle keepit
        add al,7
keepit:
        stosb
        cmp di,offset stopbyte
        jl MakeSerial
;-----------------------------------------------------------------
        mov dx,offset NH2
printMsg:
        mov ah,9
        int 21h
exit:
        ret

StartMsg   db 0dh,0ah,'OCU Keggen #1 ',0feh,' Ghiribizzo 1998 ',0dh,0ah
           db 0dh,0ah,'Enter Name : $'
NameInput  db 17
NameRead   db ?
NameBuffer db 'mk3 "![]ns)%3x#0Z'
nh2        db 0dh,0ah,'Serial Number: '
NameHash   db 16 dup('y')
stopbyte   db 0dh,0ah,'$'
Table1:
PRD:
END


; Cruehead's Keygen ===========================================================
.model tiny
.386
.stack
.data
StartMsg  db 0dh,0ah,'OCU Keggen #1 ',0feh,' Cruehead 1998 ',0dh,0ah
          db 0dh,0ah,'Enter Name : $'
SerialMsg db 0dh,0ah,'Serial Number: '
NameVar   db 011h,0h,06Bh,06bh,033h,020h,022h,021h,05bh,05dh,06eh
          db 073h,029h,025h,033h,078h,023h,030h,'$'
Table     db 037h,035h,034h,031h,036h,032h,046h,044h,046h,044h,044h
          db 031h,035h,035h,038h,035h,036h,046h,032h,045h,036h,030h
          db 031h,039h,033h,034h,030h,046h,031h,042h,044h,030h,043h
          db 036h,043h,035h,039h,045h,039h,033h,036h,043h,037h,035h
          db 036h,044h,045h,036h,032h,044h,031h,037h,039h,030h,031h
          db 042h,046h,043h,034h,032h,031h,035h,037h,034h,044h,032h
          db 032h,032h,030h,043h,034h,030h,044h,044h,033h,039h,044h
          db 043h,038h,036h,031h,038h,041h,037h,034h,046h,045h,041h
          db 036h,044h,043h,041h,041h,039h,043h,037h
.code
.startup
        mov ah,09h
        lea dx,StartMsg
        int 21h
        mov ah,0ah
        lea dx,NameVar
        int 21h
OnceAgain:
        mov bl,NameVar[di+4]
        cmp bl,0dh
        jne noprob
        mov bl,02bh
noprob:
        mov al,table[bx-020h]
        mov NameVar[di+2],al
        inc di
        cmp di,0Eh
        jne OnceAgain
        mov word ptr NameVar[16],00a0dh
        mov ah,09h
        lea dx,SerialMsg
        int 21h
.exit
end


; Bb's Keygen =================================================================
; KG10 - Ghiribizzo KeyGen
; written by bb 12Sep98 1:30AM
; next revision 13Sep98 5:00PM
; yet more changes - 26Sep98 - late late night
; eat 3 more bytes 28Sep98
;
; comments where the evils lay
;
; I just knew that I HAD to make this thing 256 bytes or less. Beware: This
; is NOT an example of good coding practice! I almost wish I could do a
; "bytes saved" comparison for all the little hacks.
;
; I've gotten this to assemble under TASM. It MUST assemble as a 16-bit COM
; file, and even then I can't guarantee that the offsets will remain stable
; between various assemblers. Let me restate that: I CAN guarantee that this
; won't work for you when you try and assemble it yourself. :)
;
P8086
MODEL TINY
DATASEG
OffsetStartMsg  EQU 52h
OffsetMySerial  EQU 7fh
OffsetSerialMsg EQU 91h
OffsetMyName    EQU 0a3h
StartMsg  db 0dh,0ah,'OCU Keggen #1 ',0feh,' ----- bb ----- 1998',0dh,0ah
; There's no reason not to re-use this section of the StartMsg, since it fits
; perfectly though code had to be added to affix a linefeed
MySerial  db 0dh,0ah,'Enter Name : $'
SerialMsg db 0dh,0ah,'Serial Number: $'
; previous change to MyName not needed anymore
MyName    db 11h, 0h, 6Dh, 6Bh, 33h, 20h, 22h, 21h, 5Bh, 5Dh, 6Eh, 73h, 29h,
          db 25h, 33h, 78h, 23h, 30h, 5Ah
; Not only does the full table not need to be used, but since it's basically a
; substitution cypher we can fit everything into these 96 or so bytes
; Also, the trailing commented-out 37h saves us one byte. It's the
; substitution for 7Fh, but since 7Fh is a DELETE when using 0a/int21h, it
; never gets accepted by KGT.COM or by
; this keygen.
; Therefore, it's useless and unneeded.
; NewTable db '754162FDFDD155856F2E6019340F1BD0C6C59E936C756DE62D17901BFC421574
; D2220C40DD39DC8618A74FEA6DCAA9C';, 37h
; and I missed the fact that it also only uses characters 0-9 and A-F
; which can be expressed in 4 bits, cutting the 96 byte table in half
NewTable  db 75h, 41h, 62h, 0FDh, 0FDh, 0D1h, 55h, 85h
          db 6Fh, 2Eh, 60h, 19h, 34h, 0Fh, 1Bh, 0D0h
          db 0C6h, 0C5h, 9Eh, 93h, 6Ch, 75h, 6Dh, 0E6h
          db 2Dh, 17h, 90h, 1Bh, 0FCh, 42h, 15h, 74h
          db 0D2h, 22h, 0Ch, 40h, 0DDh, 39h, 0DCh, 86h
          db 18h, 0A7h, 4Fh, 0EAh, 6Dh, 0CAh, 0A9h, 0C7h
CODESEG
STARTUPCODE
; A note here: We're at <256 bytes and we fit snugly between 0100h-0200h in
; memory. Therefore, any offset to text that we need is going to have a
; constant value for DH, namely 01h. By initializing DH once at this next line
; of code, we never need to change DH again, only DL. We'll save a few bytes
; here and there because of it, though it's more work to find the offsets
; manually after assembly, and then hard-coding them in and re-assembling. I
; suppose there might be some construct like offset ( MyName AND 00ffh ), but
; I didn't really look into it. EQU will work.
        mov dx, offset StartMsg
        mov ah,09h
        int 21h
; save a byte
        mov dl, OffsetMyName
        mov ah,0ah
; Now that we're through with the StartMsg, we can adjust MySerial to print a
; linefeed. I can save a byte here by using the AH register instead of a 0AH
; immediate value, since AH is now set to 0AH for the int21
; get-string-from-keyboard.
        mov [MySerial+10h], ah
        int 21h
; 2 into DL for a division during the main loop
        mov dl, 2
; We start at the END of MyName and work our way backwards, because we can
; avoid the CMP and simply check for the Signed flag when BP rolls over. We
; save a couple of bytes.
        mov bp, 0fh
; Also, I shaved a few bytes out of this by using BP in place of BX, avoiding
; the PUSH/POPs which I shouldn't have done anyway since I didn't define a new
; stack for the application.
loop1:
        xor ah, ah      ; need to clear ah and bh, unfortunately.
        xor bh, bh
        mov al, [bp+MyName+2]
        sub al,20h      ; if the sub sets carry, then we're probably the
                        ; carriage return
        jnc skipcr      ; so we'll set ourselves = to something that has the
                        ; same table
        mov al, 03h     ; lookup value as the carriage return.
skipcr:
        div dl          ; after the DIV, AL will be two table values, and AH
                        ; will decide which one we should use
        mov bl, al      ; we need table lookup through bx, not al
        mov al,[bx+NewTable]
        test ah, dh     ; since dh always=1, test ah,dh will save us a byte
                        ; over test ah,01
        jne skipshift
; if AH=0, use least significant nibble
; if AH=1, use most significant nibble by shifting MSN into LSN
; TASM assembles shr al, 4 as shr al, 1 four times.. we don't want that.
        db 0c0h, 0e8h, 4        ; shr al, 4
skipshift:
        and al, 0Fh     ; strip off high nibble
        add al, 30h     ; and turn into printable [0-9A-F] character
        cmp al, 39h
        jle numnum
        add al, 7
numnum:
        mov [MySerial+bp],al
        dec bp
        jns loop1       ; loop until bp flips
; save another "offset" byte
        mov dl, OffsetSerialMsg
        mov ah, 9
        int 21h
; save another byte
        mov dl, OffsetMySerial
; AH should already == 9, no need to specify it here.
        int 21h
; End of the line
        mov ah,4ch
        int 21h
END


; Spyder's Keygen==============================================================
; Ghiribizzo's Key Generator Competition entry by Spyder
; Sheesh you get assembler source and you want comments?
; Only one nibble of each byte in the original key table holds useful
; information. Only key table entries in the range 20..0x7F and 0x0D are
; needed - those 60 nibbles are packed into a 30 byte table, 0x0D is handled
; as a special case.
; The rest is just space conscious assembler with a few wrinkles to save
; bytes. I worry I may have missed some pattern in the key table, could it
; be generated or derived? Otherwise I'm pretty happy with the result.
.286
seg000 segment byte public 'CODE'
        assume cs:seg000
        org 100h
        assume es:nothing, ss:nothing, ds:seg000
        public start
start proc near
        mov ah, 9
        mov dx, offset StartMsg
        int 21h                 ; Sign on
        mov ah, 0Ah
        mov dx, offset Buffer
        int 21h                 ; Get name
        mov si, offset BufferCont       ; Set up for loop
        mov di, offset Serial
        mov bx, offset Key - 10h
        xor ax,ax
        mov cx,10h
loop1:
        lodsb
;       cmp al,0dh      ; don't need this because we arranged the data
;       jnz skip0       ; before the key table to give the right code
;       mov al,'p'      ; for this out of range case
skip0:
        sar al,1
        xlat
        jc skip1
        sar al,4
skip1:
        and al,0fh
        add al,'0'
        cmp al,'9'
        jle skip2
        add al,7
skip2:
        stosb
        loop loop1
        movsw
        movsb
        mov ah,9
        mov dx,offset SerialMsg
        int 21h
        int 20h
start endp

Buffer     db 11h       ;
           db 0         ;
BufferCont db 'm'
           db 'k'
           db '3'
           db ' '
           db '"'
           db '!'
           db '['
           db ']'
           db 'n'
           db 's'
           db ')'
           db '%'
           db '3'
           db 'x'
           db '#'
           db '0'
           db 0dh, 0ah, '$'
StartMsg   db 0dh,0ah,'OCU Keggen #1 ',0feh,' ----- spyder ----- 1998',0dh,0ah
           db 0dh,0ah,'Enter Name : $'
           db 0         ; A crucial spacer
Key        db 075h, 041h, 062h, 0FDh, 0FDh, 0D1h, 055h, 085h
           db 06Fh, 02Eh, 060h, 019h, 034h, 00Fh, 01Bh, 0D0h
           db 0C6h, 0C5h, 09Eh, 093h, 06Ch, 075h, 06Dh, 0E6h
           db 02Dh, 017h, 090h, 01Bh, 0FCh, 042h, 015h, 074h
           db 0D2h, 022h, 00Ch, 040h, 0DDh, 039h, 0DCh, 086h
           db 018h, 0A7h, 04Fh, 0EAh, 06Dh, 0CAh, 0A9h, 0C7h
SerialMsg  db 0dh,0ah,'Serial Number: '
Serial:
seg000 ends
end start


; VoidLord's Keygen============================================================
; OCU Keygen #1 | VoidLord 1998
; Category: newbie (this is my first keygen)
; Solution:
; for the every possible input char (20h-7fh) the "serial" char is stored in
; the Table. Since the output chars can only be 0-9 and A-F, we can store two
; chars in one byte, reducing the table size to 48 bytes.
seg000 segment byte public 'CODE'
        assume cs:seg000
        org 100h
        assume es:nothing, ss:nothing, ds:seg000
start proc near
        mov ah, 9               ; DOS - Write starting message
        lea dx, StartMsg
        int 21h
        mov ah, 0ah             ; DOS - read Name
        lea dx, Serial
        int 21h
        xor ax, ax
        xor bx, bx
loop1:
        mov al, [Serial2+bx]    ; the output will be in the same buffer
        cmp al, 0dh             ; end of input string (0dh) ?
        jne no_cr
        mov [Serial2+bx], '1'   ; the output char will be '1'
        jmp finish              ; the remaining chars are OK already
no_cr:
        push bx                 ; now we should translate the namechar
        sub al, 20h             ; to the serial number char, using the
        mov bx,ax               ; lookup Table
        shr bx,1                ; we have two chars in one byte in the Table
        and al, 1
        jnz odd                 ; is this char "even or odd" ?
        mov al, [Table1+bx]
        and al, 0fh             ; if even, use the lower 4 bits
        jmp end_l
odd:
        mov al, [Table1+bx]
        mov cl, 4
        shr al, cl              ; if odd, use the higher 4 bits
end_l:
        pop bx                  ; translate the number to the hex char
        cmp al, 10              ; is it digit 0-9 or letter A-F
        jl digit
        add al, 7               ; if letter, add 7
digit:
        add al, '0'             ; if digit, just add '0'
        mov [Serial2+bx], al
        inc bx                  ; process next input char
        cmp bx, 10h
        jl loop1
finish:
        mov Serial, ':'         ; complete the output string
        mov Serial+1,' '
        mov ah, 9               ; DOS - Print solution
        lea dx, SerialMsg
        int 21h
        mov ah, 4Ch             ; DOS - QUIT with EXIT
        int 21h
start endp

StartMsg  db 0dh,0ah,'OCU Keygen #1 ',0feh
          db ' ----- VoidLord ----- 1998'
          db 0dh, 0ah, 0dh,0ah,'Enter Name : $'
SerialMsg db 0dh,0ah,'Serial Number'
Serial    db 11h, 0
Serial2   db 67, 57, 69, 55, 52, 53, 50, 53, 56, 55
          db 68, 50, 69, 54, 49, 54, 0dh, 0ah, '$'
Table1    db 87, 20, 38, 223, 223, 29, 85, 88, 246, 226
          db 6, 145, 67, 240, 177, 13, 108, 92, 233, 57
          db 198, 87, 214, 110, 210, 113, 9, 177, 207, 36
          db 81, 71, 45, 34, 192, 4, 221, 147, 205, 104
          db 129, 122, 244, 174, 214, 172, 154, 124
seg000 ends
end start


::/ \::::::.
:/___\:::::::.
/|    \::::::::.
:|   _/\:::::::::.
:|   _|\  \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                   How to Use A86 for Beginners
                                                                     by Linuxjr

Requirements:
  - Basic DOS knowledge, like copying and renaming files and such

I am writing this paper because I find plenty of tutorials and books all about
assembly and how to write programs and how to do loops, if/else statements,
etc... But one thing I did not see plenty of is tutorials on how to set up the
assembler of choice that you grow fond of, for instance nasm, a86, tasm, masm,
GAS, etc. So I am writing about a86, using my college notes and the experience
I gained from my Assembly class.

I hope this will help you enjoy a86 and encourage you to learn how to manage
up to 286 opcodes and 16-bit code before you start tackling 32-bit and Windows
programming in assembler. This is a sort of warning that you will only be able
to write DOS programs, but you have to learn how to crawl before you can walk,
and you have to learn how to walk before you can run. I hope to show you how
to set up a86, how to write a few simple programs with the template I use, and
how to do some basic stuff in assembler.

I took a college course on Assembly a couple of months ago, and I was happy to
learn the internals of the system and how to manipulate the registers for some
awesome results. The assembler that we used was a86 by Eric Isaacson. This is
a shareware program, meaning you get to play with it before buying it. To get
this assembler go to - http://www.eji.com/a86/ - and you will see where to
download the programs. It is in a zip file and you just unzip it with your
favorite program like winzip or pkunzip. You should also download d86, the
debugger, for use with your a86 programs. Once you have downloaded them, unzip
the files to a directory such as c:\a86, or even put them on a floppy disk if
you are worried about space.

Getting Started
---------------
Let's get into it: you've got the assembler and the debugger, what next?
First of all, we have to make a text file, since all asm source code is
nothing but plain text containing the instructions and directives that tell
your program what to do. I start all my a86 programming by opening up my
template.asm, which I got from school; it is a useful template and it makes a
good DOS .EXE when you assemble it with the supplied batch file. Cut the
following code and save it in a text file called template.asm:

X--------Begin Cutting here--------------------------------------------
; PROGRAM :
;
; AUTHOR :
;
; PURPOSE :
;
; PROGRAM OUTLINE :
;
;============================== EXTERN ===============================

;=========================== STACK SEGMENT ===========================
sseg segment para stack 'STACK'
        db 100H dup ( ? )       ; allow 256 bytes of memory for
                                ; use by our program stack.
sseg ends
;============================ DATA SEGMENT ============================
dseg segment para 'DATA'
dseg ends
;============================ CODE SEGMENT ============================
cseg segment para 'CODE'        ; Begin the Code segment containing
                                ; executable machine code
program proc far                ; Actual program code is completely
                                ; contained in the FAR procedure
                                ; named PROGRAM
        assume cs:cseg, ds:dseg, ss:sseg

; Set Data Segment Register to point to the Data Segment of this program
        mov ax,dseg
        mov ds,ax

;=============== Rest of MAIN PROGRAM code goes here ==================

exit:
        mov ax,4c00h            ; terminate program execution and
        int 21h                 ; transfer control to DOS
program endp                    ; end of the procedure program
;============================ PROCEDURES ==============================
cseg ends                       ; End of the code segment containing
                                ; executable program.
        end program             ; The final End statement signals the
                                ; end of this program source file, and
                                ; gives the starting address of the
                                ; executable program
X--------Stop Cutting here--------------------------------------------

Now we have a template to use, and this is just one out of many templates you
can make for your assembly programs. Now let's begin to have fun.

These few programs will get us going with a basic feel for how to set up a
simple hello program. What we will learn from this example is:

 1) The basic mechanics of editing the template file to get an ASM source
    code file, assembling and linking it, and possibly fixing syntax errors.
 2) Nearly all of the programs have loops in them, having different formats.
 3) The operation of several INT 21H functions: 01H, 02H and 08H (character
    input and output), 09H (string output), and 4CH (program termination)
 4) The operation of the DOSIOLT procedures: inhex16 and outhex16, and how to
    assemble and link a program that uses them.
 5) Both string and numeric variables will be demonstrated.

Creating an ASM file for the Message Program
--------------------------------------------
To become familiar with the process of creating an assembly program, you will
create a simple program that prints a one line message. As with most
programming languages, Assembly programming starts with a plain text file
containing the program instructions to execute. Ordinarily, a programmer would
have to type in the entire source file from scratch. But 8086 assembler
program files contain a large number of setup directives and declarations
which are essentially the same for every program. It will be easier to start
with a file that has all the necessary directives and declarations already in
it, and just add to it the actual program parts. The file template.asm is just
such a template: it contains all the necessary pieces of a program, except for
the actual program itself.

Make a copy of the template.asm file, and name it something appropriate:
message.asm is a good choice. The file extension must be ASM. You will edit
the new file to create your first program. DO NOT EDIT template.asm itself!!!!
You will use this template file as the start of your assembly programs, so it
should not be altered (until you get advanced enough to play around with it
;-). We will be using EDIT in a DOS box as our editor, though you can use
notepad or Ultraedit to edit your assembly files as well.

The Comments

All of the programs that you will write should have a descriptive set of
header comments at the top. Any text AFTER a semicolon is considered a
comment. The top of your new program file should already have the basic
outline for this comment. Edit your message.asm file to have something like
this:

; PROGRAM : Message Program
;
; AUTHOR : Your Name here
;
; PURPOSE : This program simply prints a one line message
;           to the screen
; PROGRAM OUTLINE : Use INT 21H Function 09H to print the message.

This is just an example to help you record what you want to do, and to have a
reference if you were to walk away from a project for a year or so...the
header will make a nice reminder of what you were trying to get this program
to do.

The Ram Variable

The program that you will create in this part requires a variable. You will
create a string of characters labeled message. The part of the file where all
data is placed is the Data Segment. Look in your ASM file for the following
lines:

dseg segment para 'DATA'
dseg ends

Change this part of the code so that the message to be printed is defined.
The result will look like:

dseg segment para 'DATA'
message db 0DH, 0AH, "WHOPPEEEE!!! My first Message.", 0DH, 0AH, "$"
dseg ends

The HEX values are the two-byte sequence for a DOS newline (CR-LF). The first
character of "0DH" and "0AH" is a ZERO, not a capital O. Note that there is NO
semicolon before "message". Do not allow this part to break over two lines.
The Code

Now locate the part of the file where the program code goes. It should
look like this:

;========================Main Program================================
;
program proc    far             ; Actual program code is completely
                                ; contained in the FAR procedure
                                ; named PROGRAM

        assume  cs:cseg, ds:dseg, ss:sseg

        ; Set Data Segment Register to point to the Data Segment
        ; of this program
        mov     ax,dseg
        mov     ds,ax

;=============== Rest of MAIN PROGRAM code goes here ==================

exit:   mov     ax,4c00h        ; terminate program execution and
        int     21h             ; transfer control to DOS

program endp                    ; end of the procedure program

;============================ PROCEDURES ==============================

cseg    ends                    ; End of the code segment containing
                                ; executable program.

        end     program         ; The final End statement signals the
                                ; end of this program source file, and
                                ; gives the starting address of the
                                ; executable program

All of the code for your program should REPLACE the comment: "Rest of
MAIN PROGRAM code goes here". Here is the code you will use to print
out the message:

        ;Print the message
        mov     dx, offset message
        mov     ah, 09H
        int     21H

This code just calls the DOS Interrupt used to print strings to the
screen. Interrupt 21H is a general starting point for many useful DOS
calls. The sub-function used to print strings is Function 09H; this
value must be loaded into the AH register before calling. Also, Int 21H
Function 09H requires the address of the message to be placed in the DX
register. The above code performs these two initialization tasks, and
then calls the interrupt. Take careful note of the semicolons which
start the comments. Also, do not alter any other part of the code.
These were the only two changes you needed to make.

Assembling with asm86.bat
-------------------------
Now we have written our first asm file.
To assemble with a86 you could try to use the switches from the manual
that is included with the a86 package, or you can make things easy by
using this batch file, which is designed for programs that use the
template file. Here is the batch file:

:------------------------------ASM86-----------------------------------
@echo off
REM This is a simple batch file to use a86 and link:
if exist %1.asm GOTO FOUND
echo %0 ERROR : %1.asm -- FILE NOT FOUND
echo Usage: %0 file [link-file]
GOTO STOP

:FOUND
:-- Assemble the program
echo a86 +O +S +E %1.asm
a86 +O +S +E %1.asm

::-- IF THERE WAS AN ERROR, STOP
IF ERRORLEVEL 1 GOTO STOP

::-- IF there is a second file name, assume it is a OBJ file,
::-- and link it to the %1 name.
IF X%2 == X GOTO ELSELINK
ECHO link %1+%2;
link %1+%2;
GOTO ENDIFLINK

:ELSELINK
echo link %1;
link %1;

:ENDIFLINK
:STOP

Save this as asm86.bat. The a86 switches 1) create an object file (+O),
2) suppress the creation of the symbol table file .sym (+S), and
3) write the errors to filename.err instead of inserting them into your
source file (+E). The batch file then runs the linker on the object
file.

To assemble message.asm with the batch file, type

        asm86 message

If there were any errors, you will have to edit the asm file to fix
them. The error messages displayed by the assembler should indicate the
line number and cause of the problem. Since you are just copying
pregenerated code, any errors will simply be typos. Once all of the
errors have been corrected, a pair of files will have been created.
They will have the same base name as the original asm file, but will
have different extensions:

  OBJ - Object file. Contains the basic machine code, but references
        to external procedures have not been resolved yet. This is,
        effectively, an intermediate file which is used by the linker
        to produce the final executable file.
  EXE - Executable file. All external references resolved. Completely
        runnable.

To run the program just type

        Message

and you will see the line appear on the screen. This was a simple Hello
program.
What you probably want is another example or two to try out, and that
is what we shall do. The next program won't be as long, but it has
plenty of info.

CharLoop Program
----------------
In this part, you will create a simple program that asks the user to
enter a character, and prints it out again. It does this repeatedly,
until the user hits the ESC key. DOS functions 01H and 02H are
introduced with this program, and it is the first program containing a
comparison loop. Again you should start by copying template.asm to a
file called charloop.asm. Edit the charloop.asm template so that it has
the following changes:

Create two messages by adding the following lines to the Data Segment
part of the program (see the message program instructions, if you don't
remember how to do this):

prompt  db 0DH, 0AH, "Enter a Character: $"
outmsg  db 0DH, 0AH, "You Entered: $"

Now add the code which will put the following "pseudocode" into effect:

        Repeat
            prompt for and read a character
            Print the character back out with a message
        While the character read is not esc

Which will turn out to be the following assembly code:

char_loop:
        ;Print the prompt
        mov     dx, offset prompt
        mov     ah, 09H
        int     21H

        ;Read a character into AL
        mov     ah, 01H         ;(01H - with echo; 08H - no echo)
        int     21H
        mov     bl, al          ;save character in BL

        ;Print the final message
        mov     dx, offset outmsg
        mov     ah, 09H
        int     21H

        ;Write the character to the screen
        mov     dl, bl          ;put character in dl
        mov     ah, 02H
        int     21H

        ;Loop back, only if the character was not esc (1BH)
        cmp     bl, 1BH
        jne     char_loop
        ;End Repeat

Note how the two new DOS functions are called. The Function number is
always placed in AH before calling, and the INT 21H instruction is used
to invoke the interrupt. For Function 02H, which writes a character to
the screen, the DL register must be initialized with the character to
print.
Note also that the character must be stored somewhere throughout the
whole loop, and it can NOT be stored in either AL or DL -- AL is
modified by Function 02H, and DL is modified when DX is set to the
address of the strings. So BL is used to store the character, and the
value must be transferred between AL, BL and DL during processing. This
kind of juggling happens often in assembly programming. Get this
program running to watch another good program going ;-).

CharLoop Program without Echo
-----------------------------
In the CharLoop program above, Function 01H was used to read a
character from the keyboard. It does more than just read a character;
it also echoes it back to the screen. This way, when you type
something, you get visual feedback of what you have done. Function 08H
works exactly the same as Function 01H, except for this echo feature:
Function 08H does NOT echo the character after reading it. Create a new
program which is exactly the same as CHARLOOP, except that it uses
Function 08H to read the characters instead of Function 01H. Write and
run the program to see how it works.

NumLoop Program
---------------
This program will work in a similar fashion to the CharLoop program
above, but it will read and print numbers. Since there is no DOS
interrupt to convert ASCII characters to numbers, your code will have
to do this. Fortunately, there are already procedures to do this. A few
extra steps must be taken to use them, but it will be much easier than
writing the code from scratch. See the info about DOSIOLT for details
on how to use these procedures.

DOSIOLT Procedures

Here is a description of the DOSIOLT procedures:

inhex16
    This procedure reads a HEX number in character format from the
    standard input, and converts it to a word. Spaces or Tabs may
    precede or follow the number. DOS int 21H-0AH is used to read the
    input string, so it must be terminated by a RETURN. Both upper and
    lower case letters A-F may be used.
    If the number typed is larger than FFFFH, the upper bits are lost.
    If anything unpredictable is typed (like non-HEX chars) the
    function will return junk.
        Inputs:   None
        Outputs:  AX - the word-sized number read.
        Modifies: AX, flags

outhex16
    This simple routine prints the four 'nibbles' of AX as ASCII
    digits. Four digits are always printed.
        Input:    AX - the number to be printed
        Outputs:  None
        Modifies: Flags

outhex8
    This simple routine prints the two 'nibbles' of AL as ASCII
    digits. Two digits are always printed.
        Input:    AL - the number to be printed
        Outputs:  None
        Modifies: Flags

Call

Each of these procedures is invoked with the CALL instruction. Any
inputs (registers) must be initialized before the call; any outputs
(also registers) are set by the procedure, and contain the appropriate
value after the call. For example, to print the 1-byte value "2F" to
the screen:

        mov     al, 2FH
        call    outhex8         ;Prints: 2F

To print "2AC5":

        mov     ax, 2AC5H
        call    outhex16        ;Prints: 2AC5

To read a number from the keyboard:

        call    inhex16
        ;The ax register now contains the number read

Extern

Since the code for these functions does NOT appear in your ASM file,
two special steps must be taken in creating your executable file. The
first is to declare the names of the procedures as external procedures.
This informs the assembler that the code has been written elsewhere,
and you didn't just forget to write it. The extern declaration should
come someplace early in the ASM file. Although it doesn't matter
greatly where it goes, most programmers will put these declarations
outside of all of the segments. The template file given has a spot for
externals, marked with a comment. The format for the declaration (in
this case) is:

        extern procedure_name:far

A86 USERS: The A86 Assembler uses the older version of the extern
declaration, which is spelled extrn. If you are using the a86 assembler
(asm86.bat), make sure you spell the name of the directive extrn.
procedure_name is the name of the procedure that you will use in the
program. The name only needs to be declared once in this way, no matter
how many times it is used. But if two or more DOSIOLT procedures are to
be used, each must have a separate declaration. You should NOT place
these extern declarations in your code unless you are actually using
the routines. The linker may place the code for the procedure in your
final executable even if it is never called.

Linking

A special step must be taken in linking (the second half of the
compilation phase done by asm86.bat) to link the code in DOSIOLT.
Fortunately, asm86.bat can handle the extra file fairly automatically.
Just include DOSIOLT on the command line, after your asm file name.
Example: assume you have written a program in a file called "calc.asm"
which contains calls to the DOSIOLT procedures. To assemble and link
the program:

        A:\> asm86 calc dosiolt

If you get an "Undefined Symbol" error, it is because you mistyped, or
forgot, the extern declarations for the DOSIOLT procedures. Make sure
these are correct. If you get an "Unresolved External" error, it is
because you forgot to put "DOSIOLT" as the second file name; i.e. you
typed

        asm86 calc

instead of

        asm86 calc dosiolt

This program will illustrate the use of two of the DOSIOLT procedures,
and also the use of variables, rather than registers, as places to
store information. The outline of the program is as follows:

        Loop forever
            Prompt for, and read a number into the variable NUMBER
            IF number = 0, then break out of the loop
            Print Number, with an appropriate announcement.
        End Loop

Your program will need a prompt string, a response string and a
word-sized variable in the Data Segment:

prompt1  db 0DH, 0AH, "Enter a number: $"
response db 0DH, 0AH, "You Entered: $"
number   dw ?

Number has been declared as a word-sized variable, with no initial
value. The code can now use the name "Number" just like a register name
(in most cases).
The code for the program is:

number_loop:
        ;Print the first prompt
        mov     dx, offset prompt1
        mov     ah, 09H
        int     21H

        ;Read a number into AX and put it in NUMBER
        call    inhex16
        mov     number, ax

        ;If number = 0 then exit the loop
        cmp     number, 0H
        je      end_number_loop

        ;Print the second prompt.
        mov     dx, offset response
        mov     ah, 09H
        int     21H

        ;Print the number
        mov     ax, number
        call    outhex16
        jmp     number_loop
end_number_loop:

Note that inhex16 reads a number into AX, and outhex16 prints the
number in AX, yet this code went through all the trouble of storing the
number in the variable, rather than just leaving it in AX throughout
the loop. WHY?!? Because AX was needed in between the reading and the
printing of the number: AH has to hold 09H to print the response
string. Again, this kind of juggling between registers and variables
occurs often in assembly programming.

Since two DOSIOLT procedures are being used, they must be declared. At
the top you will find the EXTERN part of your program template; add
these lines to the section:

;===============================Extern======================================
        extrn   inhex16:far
        extrn   outhex16:far

Those are all the changes needed. Don't forget to include the DOSIOLT
file on the command line when compiling, which will be

        asm86 numloop dosiolt

I do apologize for the length of this, but I got too excited when I was
messing with these old files and playing with the procedures in the
dosiolt.obj file. If you want to try to use these files, you can email
me at linuxjr@hotmail.com and request the dosiolt.obj to use with
numloop; I will be more than happy to send it.


::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                                   Using the Gnu AS Assembler
                                                   by mammon_

GAS is the GNU project port of the Unix AS assembler; it is available
as part of the binutils package which is included with any of the GNU
compilers (for example, GCC).
GAS support is built into the various GNU compilers, and so GAS can be
invoked by invoking the compiler on a .S (asm source) file; however it
can also be run on any source file (for example, .asm files) by using
the 'as' command.

The GAS documentation is available on Linux installations in info (.gz)
format, and is viewed using the command 'info as' or
'info -f as.info'. For the novice, a crash course in info: Info files
are designed in a tree structure, with each page or section being
considered a 'node'; h gets help, q quits info, SPACE scrolls down the
screen, DEL scrolls up the screen, b jumps to the beginning of the
node, e jumps to the end of the node, n jumps to the next node, p jumps
to the previous node, g jumps to a specified node, m jumps to a
specified menu item, s searches the info file, and l steps back 1 node.

The sections of the most interest in the manual will be the Directives
('g Pseudo Ops'), Symbols ('g Symbols'), Constants ('g Constants'), and
Sections ('g Sections') nodes. For more immediate references, the Intel
386-specific topics can be consulted: 'g i386-Syntax',
'g i386-Opcodes', 'g i386-Regs', 'g i386-prefixes', 'g i386-Memory',
'g i386-jumps'.

The AT&T Syntax
---------------
GAS uses the AT&T syntax, which is known to be confusing for those used
to the Intel assembler syntax. It has been said that the AT&T syntax is
less ambiguous than the Intel, and thus it has its own appeal.

Registers

One of the most obvious differences in syntax is that the registers in
the AT&T syntax are prefixed with %. Thus, 'eax ax al ah' would be
written '%eax %ax %al %ah' for GAS.

Opcode Format and Order

Unlike the Intel syntax which uses the format 'opcode dest, src', AT&T
syntax uses the format 'opcode src, dest'; thus the command
'mov eax, ebx' in Intel would be 'mov %ebx, %eax' in AT&T.
In addition, the opcodes in AT&T syntax all take suffixes to specify
the size of the operand (note that these suffixes can usually be
omitted, as GAS will guess the operand size from the size of the
register being accessed) -- thus one would add 'w' to an opcode to
specify a word operand, 'b' to specify a byte operand, and 'l' to
specify a long operand. The Intel 'mov' opcode would then be specified
in AT&T syntax by using 'movb', 'movw', or 'movl' as circumstances
warrant. Note that this carries over into far calls; as the 'FAR'
keyword is not present in GAS, one must prefix (not suffix) the call or
jump with 'l': thus a 'far call' becomes 'lcall', 'far jmp' becomes
'ljmp', and 'ret far' becomes 'lret'.

Immediate and Absolute values

Immediate values are prefixed with a $ in the AT&T syntax, while in the
Intel syntax they are unmarked. Thus a 'push 4' statement becomes a
'push $4' in AT&T. Also, an absolute (as opposed to PC-relative) jump
or call operand is prefixed by a *, while in Intel it would be
unmarked.

Memory Referencing

This is the part that is most likely to cause trouble for those used to
the Intel syntax. Intel uses the following syntax for memory
references:

     SECTION:[BASE + INDEX*SCALE + DISP]

where BASE is the register used as a base in the reference, INDEX is a
register used to calculate an offset, SCALE is the multiplier used to
calculate the offset from the INDEX register, and DISP is the
displacement from the BASE or INDEX register. Some examples from the
GAS manual:

     [ebp - 4]        [BASE + DISP]         (Note: DISP is -4)
     [foo + eax*4]    [DISP + INDEX*SCALE]
     [foo]            [DISP]                (Value pointed to by 'foo')
     gs:foo           SECTION:DISP          (Contents of variable 'foo')

AT&T syntax uses the following syntax for memory references:

     SECTION:DISP(BASE, INDEX, SCALE)

As with the Intel syntax, all of these are optional (and it appears
that BASE and INDEX are rarely used together).
The GAS manual provides the following examples equivalent to the above
Intel examples:

     -4(%ebp)         DISP(BASE)
     foo(,%eax,4)     DISP(,INDEX,SCALE)
     foo(,1)          DISP(,SCALE)    (Note: the single comma is
                                       intentional)
     %gs:foo          SECTION:DISP

Note that you must provide commas within the parentheses whenever you
skip an element (e.g., if you do not use BASE). To illustrate, here are
some examples of memory references mixed in with asm opcodes (from
http://www.castle.net/~avly/djasm.html):

     __AT&T________________________  __Intel_______________________
     movl 4(%ebp), %eax              mov eax, [ebp+4]
     addl (%eax,%eax,4), %ecx        add ecx, [eax + eax*4]
     movb $4, %fs:(%eax)             mov byte ptr fs:[eax], 4
     movl _array(,%eax,4), %eax      mov eax, [4*eax + array]
     movw _array(%ebx,%eax,4), %cx   mov cx, [ebx + 4*eax + array]

Labels & Symbols

Labels in GAS are the same as in other assemblers: the name of the
label followed by a colon. All symbol names must begin with a letter, a
period, or an underscore. Local symbols are defined using the digits
0-9 followed by a colon, and are referred to using that digit followed
by a b (for a backward reference) or f (for a forward reference); note
that this allows only 10 local symbol names, although each name may be
defined again as often as needed. A symbol can be assigned a value
using the equals sign (e.g. 'TRUE = 1') or by using the .set or .equ
directives.

Directives
----------
GAS allows most of the standard assembler directives; what follows are
the most commonly used.

.align
    Pad the section to a specified alignment (e.g. 4 bytes); this
    directive takes as an argument the alignment size, as well as an
    optional argument specifying the byte used to fill the pad areas
    (default is 00).

.ascii, .asciz, .string
    Each of these directives takes one or more strings separated by
    commas; in the .ascii directive, the strings are not terminated, in
    the .asciz and .string directives the strings are zero-terminated.
.byte, .double, .int, .word
    Each of these directives takes as an argument an expression (for
    example, value1 + value2) and sets a data item of the named size
    (byte, int, word, etc.) at the current location to the result of
    the expression.

.data, .section, .text
    The .section directive allows segments or sections of the target
    program to be defined for the linker. The .section directive takes
    a section name, as well as section flags (b = bss, w = writable,
    d = data, r = read-only, x = executable for COFF files;
    a = allocatable, w = writable, x = executable, @progbits = data,
    @nobits = no data for ELF files). The .data and .text directives
    are pre-defined .section directives for data and code sections.

.equ, .set
    Each of these sets the first argument (a symbol) to the result of
    the second argument (an expression); for example
         .equ TRUE, 1
    sets the symbol TRUE to the value 1.

.extern
    The traditional EXTERN directive is available but ignored; GAS
    treats all undefined symbols as externs.

.global, .globl
    These directives define global (exported) symbols; each takes as an
    argument the symbol to be made global.

.if /.endif
    GAS provides the usual IF...ENDIF directives for conditional
    assembly; the .if directive is followed by an expression, and all
    code between the .if and the .endif directive is assembled only if
    that expression evaluates to non-zero.

.include
    This directive includes a file at the current location; it takes as
    an argument the name of the file in quotes, for example
         .include "stdio.inc"

Assembling a Program
--------------------
A GAS program can be assembled by invoking GCC with the -O2 (optimize:
level 2) option. Note that all GAS programs must have a .text section
and a global "main" label.
Here is an example of a 'hello world'-style program in GAS:

/* gashello.S ======================================================= */
.text
message:
        .ascii "Helloooo, nurse!\0"
.globl main
main:
        pushl   $message
        call    puts
        popl    %eax
        ret
/* EOF ============================================================== */

This can be compiled with the command

     gcc -O2 gashello.S -o ghello

or with

     as gashello.S -o gashello.o
     ld -o gashello gashello.o -lc -s -defsym _start=main

Note that it is much easier to use GCC than to use AS, as you will have
to explicitly specify the libraries to link to (hence the -lc
parameter) when you call LD.

The Int80 "pid.asm" program from last month's Linux article would be
written for GAS as follows:

/* pid.S ============================================================ */
.global main
.text
szText1:  .asciz "Getting Current Process ID..."
szDone:   .asciz "Done!"
szError:  .asciz "Error in int 80!"
szOutput: .string "%d\n"
main:
        pushl   $szText1
        call    puts
        popl    %ecx
        mov     $20, %eax       /* system call 20: getpid */
        int     $128            /* 128 = 80h */
        cmp     $0, %eax
        je      Error
        pushl   %eax            /* printf(szOutput, pid) */
        pushl   $szOutput
        call    printf
        popl    %ecx
        popl    %ecx
        pushl   $szDone
        call    puts
        jmp     Exit
Error:
        pushl   $szError
        call    puts
Exit:
        popl    %ecx
        ret
/* EOF ============================================================== */

This can be compiled in the same manner as the previous example; note,
though, the need to use decimal numbers when calling interrupts (with
the GAS version used here, the 0x?? syntax for specifying a hexadecimal
integer caused the opcode to not be recognized by the assembler).


::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................FEATURE.ARTICLE
                                              A Guide to NASM for TASM Coders
                                              by Gij

Generalities
------------
The basic function of any assembler is to turn asm into the equivalent
binary code file; that's true for TASM, NASM, and any other assembler.
The differences arise in the special features each assembler offers
you.
For example, the MODEL directive exists in TASM, making it easier for
the coder to reference data variables in other segments. NASM does not
have an equivalent directive, so you have to keep track of the segment
registers yourself, and put segment overrides where they are needed.
This does not mean that NASM doesn't have good SEGMENT or GROUP
support; in fact it has both, though they are not quite the same as in
TASM. It's a different way of coding, and it may seem to require more
work, but after you get used to it it's easier, because you know
exactly what's going on in your code. NASM actually gives you the
closest possible idea of what your asm source code will become once
it's compiled.

TASM is chock-full of directives; looking at a small reference for TASM
4.0, there are at least a few dozen directives TASM uses, and you have
to know quite a few of them by heart. NASM on the other hand has very
few directives. Actually, you can write an asm file that will assemble
just fine without using a single directive, although I doubt it will be
useful in most cases. NASM's syntax is also less ambiguous, which
leaves less room for software bugs, but makes it stricter when
assembling. I actually think NASM is easier to learn than TASM since
it's much more straightforward. Your NASM Bible is of course the
accompanying docs; you can get them in a separate package from the same
place you got the binaries for NASM.

All in all I think you will find NASM to be just as capable as TASM if
not more so. Although it's missing some features TASM has, you can
always mail the author and ask for a feature, and you just might get
lucky when the new version comes out.

ASM code is usually the same in any assembler (AT&T syntax is an
exception) but there are a few subtleties that TASM coders should look
out for. The docs that accompany NASM have a nice list of them, and
I'll mention the most significant ones here.
DATA offset vs DATA contents
----------------------------
TASM uses either of these to load the offset of a variable into a
register:

     mov esi, offset MyVar
OR
     lea esi, MyVar

LEA is used to load complex offsets like "[esi*4+ebx]" into a register.
TASM supports LEA even when used with a simple offset like "MyVar".
NASM on the other hand only supports one way of loading a simple offset
into a register (the LEA form is only valid when using complex
offsets):

     mov esi, MyVar

This ALWAYS means move the offset of MyVar into esi. On the other hand,
this:

     mov eax, [MyVar]

will always mean move the contents of MyVar into eax. However, using
LEA to load a complex offset is valid in both TASM and NASM:

     lea edi,[esi*4+EBX]      ; valid in both assemblers

NASM also supports a SEG keyword:

     mov ax,SEG MyVar

This moves the segment of the variable into ax.

Segment Overrides
-----------------
TASM is more lax in its syntax, so both of these are valid code:

     mov ax,ds:[si]
AND
     mov ax,[ds:si]

NASM doesn't allow this -- if you specify a variable inside the square
brackets, all of the specifiers should be inside the square brackets.
So this is the only valid option:

     mov ax,[ds:si]

Specifying operand size
-----------------------
TASM coders usually have lexical difficulties with NASM because it
lacks the "ptr" keyword used extensively in TASM. TASM uses this:

     mov al, byte ptr [ds:si]
OR
     mov ax, word ptr [ds:si]
OR
     mov eax, dword ptr [ds:si]

For NASM this simply translates into:

     mov al, byte [ds:si]
OR
     mov ax, word [ds:si]
OR
     mov eax, dword [ds:si]

NASM allows these size keywords in many places, and thus gives you a
lot of control over the generated opcodes in a uniform way. For
example, the following are all valid:

     push dword 123
     jmp [ds: word 1234]      ; these both specify the size of the offset
     jmp [ds: dword 1234]     ; for tricky code when interfacing 32bit and
                              ; 16bit segments

It can get pretty hairy with operand sizes spelled out this explicitly,
but the important thing to remember is you can have all the control you
need, when you want it.
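
The three points above (offset vs. contents, overrides inside the
brackets, and size keywords) can be put together in one small file. This
is only a sketch: it assumes a 16-bit DOS .COM target, and the labels
MyVar and Counter are made up for the example.

```nasm
; sizes.asm - hypothetical example, not part of any real project
; assemble with: nasm -f bin sizes.asm -o sizes.com
        bits 16
        org 100h

start:
        mov  si, MyVar          ; SI = offset of MyVar (no OFFSET keyword)
        mov  ax, [MyVar]        ; AX = contents of MyVar
        mov  al, [es:si]        ; segment override goes inside the brackets
        mov  byte [Counter], 1  ; no register operand, so the size is spelled out
        inc  word [MyVar]       ; same here: NASM will not guess the size
        lea  di, [bx+si+2]      ; LEA with a complex offset
        ret                     ; a COM file can return to DOS with a plain RET

MyVar:   dw 1234h
Counter: db 0
```

Leaving out "byte" or "word" on the two memory-only instructions should
make NASM complain that the operation size is not specified, which is
exactly the strictness described above.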
Functions
---------
TASM has special directives for declaring a procedure and ending it.
Why? A procedure is just another code label you CALL instead of JMP --
NASM got it right. TASM uses:

     ProcName PROC
          xor ax,ax
          ret
     ProcName ENDP

while NASM just uses:

     Procname:
          xor ax,ax
          ret

To declare a procedure PUBLIC, just use the GLOBAL directive:

     GLOBAL Procname
     Procname:
          xor ax,ax
          ret

Local Labels
------------
Those of you that know C also know that a member of a struct can be
referenced as StructInstance.MemberName. This is rather similar to the
way NASM allows you to use local labels. A Local Label is denoted by
prefixing a dot to the label name:

     Label1:
          nop
     .local:
          nop
     Label2:
          nop
     .local:
          nop

This won't give you an error on multiple definitions of a label, but
you can still jmp to a certain label like this:

     jmp Label2.local

...so it's local, and in a way it's also a global label.

ORG Directive
--------------
NASM supports the org directive, so if you are coding a COM file you
can start with:

     org 0x100
OR
     org 100h

(NASM allows both the asm and C methods of specifying hex, so both of
the above are valid.)

Reserving Space
---------------
Once again, here NASM uses a different syntax than that of TASM. In
TASM you would declare 100 bytes of uninitialized space like this:

     Array1: db 100 dup (?)

NASM uses its own keywords to do this; these are RESB, RESW and RESD,
standing for REServeByte, REServeWord, and REServeDword, respectively.
To reserve 100 bytes, you would use the RES? keywords like this:

     Array1: RESB 100
OR
     Array1: RESW 100/2
OR
     Array1: RESD 100/4

Declaring initialized space is much like TASM, but arrays are
different. In TASM:

     Array1: db 100 dup (1)

In NASM:

     Array1: TIMES 100 db 1

TIMES is a handy little directive: it instructs NASM to perform an
action a specified number of times. In the example above it performs
"db 1" 100 times. TIMES can be used for virtually anything; for
example:

     TIMES 69 nop

will put 69 nops at the current point in the file.
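
A common use of TIMES is padding a data field to a fixed width. The
sketch below assumes a flat binary (nasm -f bin); the labels and the
16-byte field width are made up for the example. It relies on $, the
current assembly position, to work out how many pad bytes are needed.

```nasm
; pad.asm - hypothetical example
; assemble with: nasm -f bin pad.asm -o pad.bin
field:  db "NASM"                   ; 4 bytes of real data
        times 16-($-field) db ' '   ; pad with spaces up to 16 bytes total

table:  times 8 dw 0FFFFh           ; eight words, each initialized to FFFFh
        times 3 nop                 ; TIMES works on instructions too
```

Because the count is an expression, the padding adjusts by itself when
the text in the field changes; if the text grows past 16 bytes the count
goes negative and NASM should refuse to assemble the line.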
The $ (current location) symbol is supported by NASM, and can be used
to specify the 'count' operand to TIMES, so this is valid:

     label1:
          mov ax,1
          xor ax,ax
     label2:
          TIMES $-label1 nop

This expands to TIMES (label2 - label1), and will put as many one-byte
nops after label2 as the byte count between label1 and label2.

Making Structs
--------------
I fought long and hard to get structs going; the docs were a bit vague,
and it took a while to get it, so here it is. Using a struct is divided
into 2 parts: declaring the prototype, and making an instance. A
simple, 2-member structure would be defined as follows:

     struc st
          stLong resd 1
          stWord resw 1
     endstruc

This declares a prototype struct named st, with 2 members: stLong,
which is a DWORD, and stWord, which is a WORD. It uses the reserve
directives because it's a prototype, not a real struct. You can use
istruc to make a real instance that you can reference as data in your
code:

     mystruc:
     istruc st
          at stLong, dd 1
          at stWord, dw 1
     iend

*Note: it's important to put the label on a different line.

This creates a struct named mystruc of type st; the "at" keyword is
used to assign initial values to the members of the struc (i.e., at the
reserved bytes of memory). The notation for referencing members is not
like in C. This is because of the way structures are implemented; in
NASM, each member is assigned an offset relative to the beginning of
the struct:

     mystruc:
     istruc st
          at stLong, dd 1     ; offset 0
          at stWord, dw 1     ; offset 4
     iend

The notation for referencing a member is therefore:

     mov eax, [mystruc+stLong]

This is because mystruc is a constant base, and the member is a
relative offset to it. It's similar to referencing a data array.

One thing I should mention: if you declare struct prototypes as above,
the member names/labels will be global, so you will get collisions if
you use the same member name in your code or in another struct
prototype.
To avoid this, precede the member names with a dot '.', and then
reference them in relation to the prototype's name in the instance
declaration. For example:

     struc st
          .stLong resd 1
          .stWord resw 1
     endstruc

     mystruc:
     istruc st
          at st.stLong, dd 1
          at st.stWord, dw 1
     iend

And this is how you reference the members in code:

     mov ax,[mystruc+st.stWord]

This may seem confusing; you should understand that "mystruc" is the
base of a particular instance, and "st.stWord" is an offset relative to
the start of the struct, so in pseudo-code it translates into:

     mov ax,[offset mystruc + (offset stWord - offset start_of_proto)]
or
     mov ax,[offset mystruc + 4]

...which gives you the correct offset for the stWord member of the
"mystruc" struct instance.

Using Macros
------------
This is a large part of the NASM docs, and a bit too much to get into
in depth here. I'll try and cover the major issues. There are 2 types
of macros, single-line and multi-line; all macro keywords are preceded
by a '%' character. An example of a single-line macro:

     %define mul(a,b) (a*b)

...which would be referenced in the source code as follows:

     mov eax,mul(2,3)

This will be converted into:

     mov eax,(2*3)

which assembles as "mov eax,6". You can invoke other macros from within
a macro:

     %define fancymul(a,b) ( a * triple_mul(4) )
     %define triple_mul(a) (a*3)

     mov eax,fancymul(2,3)

This becomes:

     mov eax, ( 2 * (4*3) )

These are not very useful examples, but I'm sure you can see the
potential.

Multi-line macros are much the same as single-line macros, but the
syntax is a bit different:

     %macro name number_of_args
          <body of macro>
     %endmacro

So, for example, if you wanted to make a small asm effort-saver you
could write the following macro:

     %macro prologue 1
          push ebp
          mov ebp,esp
          sub esp,%1
     %endmacro

...and then you can use it in your code like this:

     DemoFunc:
          prologue 4*2
          <body of function>

This would set up a stack frame and reserve room for 2 DWORD local
variables.
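
A matching epilogue macro is a natural companion to prologue. The
following is a sketch under assumptions, not something taken from the
NASM docs: the macro name epilogue and the function AddTwo are made up,
and the code assumes 32-bit mode with C-style (cdecl) stack arguments.

```nasm
; frame.asm - hypothetical example
; assemble with: nasm -f elf32 frame.asm   (or -f win32)
        bits 32

%macro prologue 1
        push ebp
        mov  ebp, esp
        sub  esp, %1            ; %1 = bytes of local variables
%endmacro

%macro epilogue 0
        mov  esp, ebp           ; throw away the locals
        pop  ebp
        ret
%endmacro

        section .text
        global AddTwo

; int AddTwo(int a, int b) -- arguments at [ebp+8] and [ebp+12]
AddTwo:
        prologue 4              ; one DWORD local at [ebp-4]
        mov  eax, [ebp+8]
        mov  [ebp-4], eax       ; local = a
        mov  eax, [ebp+12]
        add  eax, [ebp-4]       ; eax = b + local
        epilogue
```

A macro declared with 0 parameters is invoked by its bare name, which is
why "epilogue" stands alone on its line.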
You'll notice that args supplied to the macro can be referenced as %1....%n, similar to DOS and Unix shell/batch programming. This is just a quick taste; there's more to be learned about NASM macros, and the docs are your friends.

Includes
--------
Including files is easy. If you want to include .inc's into your asm file you can use:

        %include "win32.inc"

If you wish to include binary files, you must use a different keyword:

        INCBIN "data.bin"

Conditional Assembly
--------------------
NASM also has support for conditional assembly:

        %define INCLUDE_WIN32_INCS

        %ifdef INCLUDE_WIN32_INCS
          %include "win32.inc"
          %include "toolhelp.inc"
          %include "messages.inc"
        %endif

This way you can control the inclusion of files either by defining the symbol on the command line ("nasmw -dINCLUDE_WIN32_INCS") or by commenting out the %define line. The body of the %ifdef will be processed only if a macro/define named INCLUDE_WIN32_INCS is defined.

Externs, Globals and Commons
-----------------------------
When coding a project with multiple source files, writing a DLL, or calling API functions, you need to declare various symbols/data/functions as a certain type to make them available to the assembler and to you. There are 3 types of symbols in NASM: EXTERN, GLOBAL and COMMON. Their invocation is much the same:

        EXTERN symbol_name      ; use this to define API calls for use
        GLOBAL symbol_name
        COMMON symbol_name size ; COMMON also takes the size in bytes

They all must appear before the actual symbol is defined/referenced. If you have experience in asm/C, their use should be clear -- EXTERN declares an external reference for the linker to resolve (an "import"), GLOBAL declares a symbol to be globally/publicly available (an "export"), and COMMON declares a variable to be of Common data type (i.e., all instances of a COMMON variable are merged into a single instance when the program is linked). NASM 0.97 also has IMPORT/EXPORT extensions to the .obj format, for writing DLLs; read the docs for more info.
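As a rough sketch of EXTERN and GLOBAL working together, the fragment below imports one API function and exports one routine. It assumes the win32 output format ("nasm -f win32") and a linker that resolves the decorated stdcall name _MessageBoxA@16 from a user32 import library; the names _ShowHello and hello are invented for the example.

```asm
; Sketch only: assemble with "nasm -f win32", link against a user32 import library.

        EXTERN  _MessageBoxA@16         ; an "import": the linker resolves it
        GLOBAL  _ShowHello              ; an "export": visible to other object files

section .data
hello:  db      "Hello",0               ; used as both text and caption

section .text
_ShowHello:
        push    dword 0                 ; uType   = MB_OK
        push    dword hello             ; lpCaption
        push    dword hello             ; lpText
        push    dword 0                 ; hWnd    = NULL
        call    _MessageBoxA@16         ; stdcall: the callee removes the 16 bytes
        ret
```

Both declarations appear before the symbols are used or defined, as required above.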
Specifying Segment Type ----------------------- You can declare segments much the same as you would in TASM: segment .data use32 CLASS=data or segment .text use32 CLASS=code or segment Gij use16 CLASS=code This is a good way to set segments straight for linking. Note that Nasm does not require certain segments to be present: you have full control over the segmentation of the program. Output Formats -------------- Nasm supports a plethora of output formats; depending on what you are trying to accomplish, you should read the docs for special extensions to each type. The output format is chosen using "nasm -f type" on the command line, where type can be bin, obj, win32 and others. Each linker likes different formats--tlink likes obj for example, while LCC-WIN32 likes the win32 format...investigate on your own to find the best output format for your linker. *tip: when assembling into the "obj" type, make sure and use the special "..start:" symbol to specify the entry point for the file. In Closing ---------- That's all for now. This is intended to be a 'quick-start' guide for TASM coders who want --or need-- to move into NASM; it is not a substitute for the NASM documentation. If you need to reach me, my e-mail is gij <at> bigfoot.com Enjoy NASM! ::/ \::::::. :/___\:::::::. /| \::::::::. :| _/\:::::::::. :| _|\ \::::::::::. :::\_____\:::::::::::...........................................FEATURE.ARTICLE Tips on Saving Bytes in ASM Programs by Larry Hammick The programmer's word for craftsmanship is "optimization". This term refers to conservation, either of program size or execution time. It's time includes not just CPU clocks, but the time consumed by peripherals (e.g. disks, at load time) and by the operating system calls. This article is concerned with the conservation of size, or bytes. Size may refer either to the program file size, or to the size of the memory the program uses. The two are not always identical. 
In all the illustrations, we assume that 16-bit code segments are involved. The syntax we use is that of MASM 5.1; the difference from other assemblers is slight. 1. Avoid uninitialized data. --------------------------- An instruction like this: OutputHandle dw ? is usually a waste of space. Depending on the memory model (i.e. depending on whether we have CS=DS, and the like), there are several ways to omit these two bytes from the program file and the memory image. If DS is the PSP segment, use: OutputHandle equ word ptr ds:[5Ch] or similar, for a value other than 5Ch. Any program may safely use any part of the PSP from 3Ch to 07Fh, plus the word at 2Ch (environment segment). When the program is finished with the command tail (bytes 80h-0FFh), it can reuse that area as well. Other parts of the PSP should not be modified, because they may be needed by DOS when the program exits. However, in the case of a TSR, the stay-resident part of the code (e.g. an interrupt handler) may use any part of the PSP after the TSR exit has been executed. In such cases, the PSP makes a handy buffer of 100h bytes with ORG 0. If DS=CS, you can define uninitialized variables like this: OutputHandle dw ? InputHandle dw ? ORG OutputHandle Go: mov ah,30h int 21h ... mov OutputHandle,ax ... END Go or, equivalently: OutputHandle equ word ptr ds:[Go] InputHandle equ word ptr ds:[Go+2]. If DS is a dynamically allocated segment, or if it is part of the stack, there is this trick: OutputHandle equ word ptr ds:[0] InputHandle equ word ptr ds:[2]. Allocating file and memory space just for uninitialized variables wastes a few bytes here and there. Much worse, for file size, is to put whole buffers and stacks in the file: ReadBuffer db 1000h dup (0) Stack db 40h dup ("--Stack!--") Examine a few commercial programs under a hex editor or debugger to see how common this practice is. Worldwide, the quantity of disk space thus wasted must be astronomical. 
Moreover, such "data" gets copied from disk every time the program is loaded, even though it has no meaning! Perhaps assemblers and linkers will someday be smart enough to avoid this. For now, we do have EXE packers such as PKLite to compress blank data blocks, but the latter can be avoided entirely as follows. If DS is a dynamic segment or part of the stack:

        BufferSize  equ 1000h
        ReadBuffer  equ 0
        WriteBuffer equ ReadBuffer+BufferSize
        ...
        mov dx,ReadBuffer       ;rather than mov dx,offset ReadBuffer
        mov ah,3Fh
        int 21h
        ...

If the program will be small enough for the code and all data to fit in one segment, it is desirable to have CS=DS. Then you can do:

        ReadBuffer  equ offset EndOfCode
        WriteBuffer equ ReadBuffer+BufferSize
        Go:
        ...                     ;code instructions
        mov ah,4Ch
        int 21h                 ;exit
        EndOfCode label byte
        END Go

This practice is not quite safe for a COM program, because DOS will load a COM file into less than 64K if no larger block is available or if memory is fragmented. For an EXE, the EXE header can be adjusted to prevent the program from loading into too little memory.

2. Put related data together.
----------------------------
An example:

        CursorPosition label word
        CursorColumn   db 0Eh
        CursorRow      db 8

You will be able to load or save both variables with one instruction:

        mov dx,CursorPosition

Another benefit:

        and CursorPosition,0FF00h
        jnz NotAtTop

The AND instruction sets one byte and tests another, at the same time.

3. Avoid forward references.
---------------------------
Forward references in source can result in worthless NOPs getting assembled. This is another illustration of the principle that assemblers are pretty dumb. Consider:

        mov cx,MsgSize          ;(1)
        ...
        Msg db "Hello",0Dh,0Ah
        MsgSize equ $-offset Msg

MsgSize is a constant word. But MASM doesn't know that when it assembles the instruction (1). So it provides 3 bytes for MsgSize, and later fills in the constant word followed by a NOP byte. One solution:

        db 0B9h                 ;opcode for mov cx,immed
        dw MsgSize
        ...
        Msg db "Hello",0Dh,0Ah
        MsgSize equ $-offset Msg

4. Use cheap opcodes.
--------------------
4.1 XCHG AX,Reg16

These 8 instructions are each just 1 byte. Don't use either MOV AX,CX or MOV CX,AX unless you need the same value in both registers. AX is special in this respect; instructions such as XCHG BX,CX or XCHG SI,DI are two bytes. XCHG EAX,Reg32 is two bytes (in 16-bit code segments), whereas MOV EAX,ECX etc. is three.

4.2 CBW, CWD, CDQ

To put AH=0, the instructions

        xor ah,ah
        sub ah,ah
        mov ah,0

occupy two bytes each. But if you know that the high bit of AL is clear (AL < 80h), the instruction CBW has the same effect (except that it leaves the flags unchanged) and is only one byte. Likewise, CWD can save over XOR DX,DX when the high bit of AX is clear. CDQ is a 2-byte opcode in 16-bit code but still better than XOR EDX,EDX, which is 3 bytes.

4.3 JCXZ

This instruction does not require a preliminary flag-setting instruction. So, you might prefer

        xchg ax,cx
        jcxz MyLabel
to
        or ax,ax
        jz MyLabel

saving one byte. Be aware that JCXZ is a relatively slow opcode.

4.4 INC Reg16 and DEC Reg16

These 16 opcodes are just one byte each. The opcodes INC Reg8 and DEC Reg8 are 2-byte. So use INC CX instead of INC CL if there is no possibility of carry from CL into CH. If CX is known to be 0, INC CX saves a byte vs. MOV CL,1, and 2 bytes vs. MOV CX,1. Similar tricks apply to going from -1 to 0, and to decrementing from 1 to 0 or from 0 to -1.

4.5 Prefer the accumulator to other registers.

The following opcodes, among others, are cheaper for AX or EAX than for other general registers.

        MOV reg,mem
        MOV mem,reg
        ADD reg,mem

5. Be flexible on flow control.
------------------------------
Block-structuring is very sensible in high-level languages, but in ASM it is little more than a pedantic habit. In ASM, a routine may have more than one entry point and more than one exit (RETN, RETF, or IRET). Several routines may share exit code or entry code. A routine need not return at all. A few examples of how this can save bytes:

5.1 Discard return addresses that won't be needed.
This sort of thing appears often: Mysub: cmp al,3 ja StcRet ... ret StcRet: stc ret ... call MySub jc Ret1 ... Ret1: ret Better is: Mysub: cmp al,3 ja DontRet ... ret DontRet: pop ax ;discard return address into some unneeded register ret ... call MySub ;returns only if input is okay ... 5.2 Reuse exit code. If you see this more than once in your source: pop bx pop dx pop ax retn, make a label at POP BX, and use a jump to that label from each other occurrence. If this happens more than once: push ax push cx push dx push bx consider a subroutine: SaveRegs: pop si ;store return address in an unneeded register push ax push cx push dx push bx jmp si 5.3. Consider CALL instead of JMP. The CALL instruction can be used instead of JMP to pass a near address at almost no cost. mov ah,30h int 21h cmp al,5 jae EnoughDOS call ErrExit db "This program requires DOS 5+",13,10,0 EnoughDOS: ... ErrExit: pop si ;"Return address" actually points at data. ErrExitLoop: lodsb or al,al jz Exit int 29h jmp short ErrExitLoop Exit: mov ax,4CFFh int 21h In the above example, the routine ErrExit writes an ASCIIZ string from CS:SI, then exits. The offset of a jump table can sometimes be passed in the same way. call SmartJump ;does not return db 3 dw Handle3 ;Handle3 and Handle7 are near code addresses db 7 dw Handle7 db 0 ;terminator for the table SmartJump: ;input is a jump table index AL. pop di ;"return address" actually points at the jump table SmartJumpLoop: cmp byte ptr[di],0 je NotFound scasb je Found scasw ;cheaper than incrementing di twice jmp short SmartJumpLoop Found: jmp word ptr es:[di] NotFound: ... The above example assumes ES=CS. 5.4 Short jumps are cheaper than near jumps. You can often save a few bytes by arranging your source so that jumps are short rather than near. If this occurs: cmp al,5 jne Not5 jmp CantRun Not5: ... jmp CantRun ... 
and CantRun is not reachable by a short jump in either instance, you might still save a byte like so:

                cmp al,5
                jne Not5
        JmpCantRun:
                jmp CantRun
        Not5:   ...
                jmp short JmpCantRun    ;2-step jump
                ...

6. Registers are cheaper than constants.
---------------------------------------
You should never write this (6 bytes):

        mov si,StringSite       ;a 16-bit constant
        mov di,StringSite

Instead (5 bytes):

        mov si,StringSite
        mov di,si

Another illustration:

        MyByte db 11h
        ...
        mov MyByte,0            ;a 5-byte instruction
        mov MyByte,bh           ;4 bytes, and equivalent if bh is known to be 0
        mov MyByte,al           ;only 3 bytes

7. Code can be used as data.
---------------------------
Here are two examples of a slick technique known as self-modifying code.

        ErrExit:
                call WriteMessage
                db 0B8h         ;code for MOV AX,Immed16
        ReturnCode db ?,4Ch
                int 21h         ;exit from program

The label ErrExit can be reached by JMPs from several points in the program. Before jumping, the code pokes in a suitable value of ReturnCode, depending on the type of error condition encountered. The above example uses part of the instruction MOV AX,4Cxxh as a variable, saving bytes.

                mov ax,352Fh    ;get INT 2Fh vector as ES:BX
                int 21h
                mov OldInt2F,bx ;this example assumes CS=DS at this point
                mov OldInt2F[2],es
                mov dx,offset OurInt2F
                mov ax,252Fh    ;set INT 2Fh vector to DS:DX
                ...
        OurInt2F:
                cmp ax,1211h    ;a function that we want to control
                jne short JmpOldInt2F
                ...             ;(handle this function)
                iret
        JmpOldInt2F:
                db 0EAh         ;opcode for jump to immediate far address
        OldInt2F dw ?,?

This manoeuvre saves bytes versus JMP DWORD PTR OldInt2F; again, the method is to put the variable (OldInt2F) right inside the code. Device drivers and other TSRs should use this trick, but I don't know of a single one which does (except my own, naturally). Safe use of self-modifying code requires some awareness of on-chip instruction caches. It's no good to modify code in memory if what will get executed is already on the CPU. The following trick, however, is quite safe.
Instead of:

        ErrExit2: mov al,2
                  jmp short ErrExit
        ErrExit3: mov al,3
                  jmp short ErrExit
        ErrExit5: mov al,5
        ErrExit:  mov ah,4Ch
                  int 21h

write:

        ErrExit2: mov al,2
                  db 3Dh        ;opcode for CMP AX,immed, to disable the following
        ErrExit3: mov al,3      ;2-byte instruction
                  db 3Dh
        ErrExit5: mov al,5
        ErrExit:  mov ah,4Ch
                  int 21h

8. Miscellaneous byte-savers.
----------------------------
Since the instruction sets of the x86 CPUs are so elaborate, there are many more ad hoc ways to reduce, reuse, and recycle bytes. The following are only a few.

8.1 After a loop, CX is 0. Thus

                mov cx,1234h
        MyLoop: ...
                ...
                loop MyLoop
                mov cx,56h
                ...

is wasteful. The last instruction should be mov cl,56h.

8.2 Use conditional MOVs.

                cmp VideoMode,7
                je BlackAndWhite
                mov dx,0B800h
                jmp short Either
        BlackAndWhite:
                mov dx,0B000h
        Either: ...

The above code wastes bytes. Better is:

                mov dx,0B800h
                cmp VideoMode,7
                jne GotVideoBase
                mov dh,0B0h
        GotVideoBase:
                ...

The improved version has one jump instruction instead of two, and in this example saves an additional byte by resetting only DH, not DX. With the Pentium Pro, Intel introduced a useful set of conditional MOVs (CMOVcc) right into the instruction set.

8.3 To test the high bit of a register, avoid the constants 80h and 8000h. For example,

        test dh,80h
        jnz MyLabel

is 5 bytes, but

        or dh,dh
        js MyLabel

is 4. The OR instruction also leaves more information in the flags. TEST DH,DH or AND DH,DH have the same effect as OR DH,DH.

8.4 To determine if several variables of the same size are all 0, OR them together, and the zero flag will tell you. To determine if they are all -1, AND them together and increment the result.

9.
Postlude
-----------
Intel makes their excellent CPU documentation available free, from:

        http://developer.intel.com/design/litcentr/index.htm

It is in Adobe PDF format; you will need the Acrobat Reader, also free, from:

        http://www.adobe.com/prodindex/acrobat/readstep.html

If all else fails, you can try to wake me up at: hammick@bc.sympatico.ca

Regards from Vancouver,
Larry

                                                                ::/ \::::::.
                                                                :/___\:::::::.
                                                                /|    \::::::::.
                                                                :|   _/\:::::::::.
                                                                :|   _|\  \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING
A Simple Window
by Iczelion

In this tutorial, we will build a Windows program that displays a fully functional window on the desktop. Download the example file here:

        http://203.148.211.201/iczelion/files/tut03.zip

Preliminary:

Windows programs rely heavily on API functions for their GUI. This approach benefits both users and programmers. Users don't have to learn how to navigate the GUI of each new program, because the GUIs of Windows programs are alike. For programmers, the GUI code is already there, tested, and ready for use. The downside for programmers is the increased complexity involved. In order to create or manipulate any GUI object such as a window, menu or icon, programmers must follow a strict recipe. But that can be overcome by modular programming or the OOP paradigm. I'll outline the steps required to create a window on the desktop below:

  1. Get the instance handle of your program (required)
  2. Get the command line (not required unless your program processes a command line)
  3. Register the window class (required, unless you use predefined window types, e.g. MessageBox)
  4. Create the window (required)
  5. Show the window on the desktop (required unless you don't want to show the window immediately)
  6. Refresh the client area of the window
  7. Enter an infinite loop, checking for messages from Windows
  8. If messages arrive, they are processed by a specialized function that is responsible for the window
  9. Quit the program if the user closes the window

As you can see, the structure of a Windows program is rather complex compared to a DOS program. But the world of Windows is drastically different from the world of DOS. Windows programs must be able to coexist peacefully with each other. They must follow stricter rules. You, as a programmer, must also be more strict with your programming style and habits.

Content:

Below is the source code of our simple window program. Before jumping into the gory details of Win32 ASM programming, I'll point out some fine points which will ease your programming.

You should put all Windows constants, structures and function prototypes in an include file and include it at the beginning of your .asm file. It'll save you a lot of effort and avoid typing errors. Most of the time, you can use an include file from some Win32 asm example. I have used windows.inc from Steve Gibson's Small Is Beautiful example and made some additions of my own.

Use the IncludeLib directive to specify the import library used in your program. For example, if your program calls MessageBoxA, you should put the line:

        IncludeLib user32.lib

at the beginning of your .asm file. This directive tells MASM that your program will make use of functions in that import library. If your program calls functions in more than one library, just add an includelib for each library you use. Using the IncludeLib directive, you don't have to worry about import libraries at link time. You can use the /LIBPATH linker switch to tell Link where all the libs are.
When declaring API function prototypes, structures, or constants in your include file, try to stick to the original names used in Windows include files, including case. This will save you a lot of headache when looking up some item in Win32 API reference. Use makefile to automate your assembling process. This will save you a lot of typing. ; ============================================================================= include windows.inc ; .386 and .model are already declared in windows.inc includelib user32.lib ; calls to functions in user32.lib and kernel32.lib includelib kernel32.lib .DATA ; initialized data ClassName db "SimpleWinClass",0 ; the name of our window class AppName db "Our First Window",0 ; the name of our window .DATA? ; Uninitialized data hInstance HINSTANCE ? ; Instance handle of our program CommandLine LPSTR ? .CODE ; Here begins our code start: invoke GetModuleHandle, NULL ; get the instance handle of our program. ; Under Win32, hmodule==hinstance mov hInstance,eax invoke GetCommandLine ; get the command line. mov CommandLine,eax invoke WinMain, hInstance,NULL,CommandLine, SW_SHOWDEFAULT ; call Winmain invoke ExitProcess,eax ; quit our program. The exit code is ; returned in eax from WinMain. 
WinMain proc hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:SDWORD LOCAL wc:WNDCLASSEX ; create local variables on stack LOCAL msg:MSG LOCAL hwnd:HWND mov wc.cbSize,SIZEOF WNDCLASSEX ; fill values in members of wc mov wc.style, CS_HREDRAW or CS_VREDRAW mov wc.lpfnWndProc, OFFSET WndProc mov wc.cbClsExtra,NULL mov wc.cbWndExtra,NULL push hInstance pop wc.hInstance mov wc.hbrBackground,COLOR_WINDOW+1 mov wc.lpszMenuName,NULL mov wc.lpszClassName,OFFSET ClassName invoke LoadIcon,NULL,IDI_APPLICATION mov wc.hIcon,eax mov wc.hIconSm,0 invoke LoadCursor,NULL,IDC_ARROW mov wc.hCursor,eax invoke RegisterClassEx, addr wc ; register our window class invoke CreateWindowEx,NULL,\ ADDR ClassName,\ ADDR AppName,\ WS_OVERLAPPEDWINDOW,\ CW_USEDEFAULT,\ CW_USEDEFAULT,\ CW_USEDEFAULT,\ CW_USEDEFAULT,\ NULL,\ NULL,\ hInst,\ NULL mov hwnd,eax invoke ShowWindow, hwnd,CmdShow ; display our window on desktop invoke UpdateWindow, hwnd ; refresh the client area .WHILE TRUE ; Enter message loop invoke GetMessage, ADDR msg,NULL,0,0 .BREAK .IF (!eax) invoke TranslateMessage, ADDR msg invoke DispatchMessage, ADDR msg .ENDW mov eax,msg.wParam ; return exit code in eax ret WinMain endp WndProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM mov eax,uMsg ; put the window message in eax .IF eax==WM_DESTROY ; if the user closes our window invoke PostQuitMessage,NULL ; quit our application xor eax,eax .ELSE ; Default message processing invoke DefWindowProc,hWnd,uMsg,wParam,lParam .ENDIF ret WndProc endp end start You may be taken aback that a simple Windows program requires so much coding. But most of these codes are just *template* codes that you can copy from one source code to another. Or, if you prefer, you could assemble some of these codes into a library to be used as prologue and epilogue codes. You can write only the codes in WinMain function. In fact, this is what C compilers do. They let you write WinMain codes without worrying about other housekeeping chores. 
The only catch is that you must have a function named WinMain, or else C compilers will not be able to combine your code with the prologue and epilogue. You do not have such a restriction with assembly language. You can use any function name instead of WinMain, or no function at all.

Prepare yourself. This is going to be a long, long tutorial. Let's analyze this program to death!

        include windows.inc
        includelib user32.lib
        includelib kernel32.lib

We must include windows.inc at the beginning of the source code. It contains important API function prototypes, structures and constants that are used by our program. The include file, windows.inc, is just a text file. You can open it with any text editor. The first two lines are the .386 and .model directives, so you don't have to specify these two lines at the beginning of the source code. Next are several macros that its author (Steve Gibson) frequently uses. The remainder of the file contains important structures, constants and function prototypes. Please note that windows.inc does not contain all structures, constants, and function prototypes of Windows. It just holds the most frequently used ones. You can add new items if they are not in the file.

Our program calls API functions that reside in user32.dll (CreateWindowEx and RegisterClassEx, for example) and kernel32.dll (ExitProcess), so we must link our program to those two import libraries. The next question: how can I know which import library should be linked to my program? The answer: you must know where the API functions called by your program reside. For example, if you call an API function in gdi32.dll, you must link with gdi32.lib. This is the approach of MASM. TASM's way of import library linking is much simpler: just link to one and only one file, import32.lib.

        .DATA
        ClassName db "SimpleWinClass",0
        AppName   db "Our First Window",0

        .DATA?
        hInstance   HINSTANCE ?
        CommandLine LPSTR ?

Next are the "DATA" sections.
In .DATA, we declare two zero-terminated strings (ASCIIZ strings): ClassName, which is the name of our window class, and AppName, which is the name of our window. Note that the two variables are initialized. In .DATA?, two variables are declared: hInstance (instance handle of our program) and CommandLine (command line of our program). The unfamiliar data types, HINSTANCE and LPSTR, are really new names for DWORD. You can look them up in windows.inc. Note that the variables in the .DATA? section are not initialized; that is, they don't have to hold any specific value on startup, but we want to reserve the space for future use.

        .CODE
        start:
            invoke GetModuleHandle, NULL
            mov hInstance,eax
            invoke GetCommandLine
            mov CommandLine,eax
            invoke WinMain, hInstance,NULL,CommandLine, SW_SHOWDEFAULT
            invoke ExitProcess,eax
            .....
        end start

.CODE contains all your instructions. Your code must reside between <starting label>: and end <starting label>. The name of the label is unimportant. You can name it anything you like so long as it doesn't violate the naming convention of MASM.

Our first instruction is the call to GetModuleHandle to retrieve the instance handle of our program. Under Win32, instance handle and module handle are one and the same. You can think of the instance handle as the ID of your program. It is used as a parameter to several API functions our program must call, so it's generally a good idea to retrieve it at the beginning of our program.

Upon return from a Win32 function, the function return value, if any, can be found in eax. All other values are returned through variables passed in the function parameter list you defined for the call. A Win32 function that you call will always preserve the segment registers and the ebx, edi, esi and ebp registers. Conversely, ecx and edx are considered scratch registers and are always undefined upon return from a Win32 function.
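One practical consequence of these register rules can be sketched as follows: a counter that must survive an API call belongs in a preserved register such as ebx rather than in ecx. This is a minimal stand-alone illustration, not part of the tutorial's program; it declares its own prototypes instead of relying on windows.inc, and MessageBeep is used only because it is a convenient one-argument call.

```asm
.386
.model flat, stdcall

MessageBeep PROTO :DWORD                ; user32.dll
ExitProcess PROTO :DWORD                ; kernel32.dll
includelib user32.lib
includelib kernel32.lib

.code
start:
        mov     ebx, 3                  ; ebx is preserved across API calls
again:
        invoke  MessageBeep, 0FFFFFFFFh ; may destroy eax, ecx and edx
        dec     ebx                     ; still our counter: it was not touched
        jnz     again
        invoke  ExitProcess, 0
end start
```

Had the counter been kept in ecx, it would have needed a push/pop pair around the call.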
The bottom line is: when calling an API function, expect the return value in eax. If any of your functions will be called by Windows, you must also play by the rules: preserve and restore the values of the segment registers, ebx, edi, esi and ebp upon function return, or else your program will crash very shortly.

The GetCommandLine call is unnecessary if your program doesn't process a command line. In this example, I show you how to call it in case you need it in your program.

Next is the WinMain() call. Here it receives four parameters: the instance handle of our program, the instance handle of the previous instance of our program, the command line, and the window state at first appearance. Under Win32, there's NO previous instance. Each program is alone in its address space, so the value of hPrevInst is always 0. This is a leftover from the days of Win16.

Note: You don't have to declare the function name as WinMain. In fact, you have complete freedom in this regard. You don't have to use any WinMain-equivalent function at all. You can paste the code in WinMain next to GetCommandLine and your program will still be able to function perfectly.

Upon return from WinMain, eax is filled with the exit code. We pass that exit code as the parameter to ExitProcess, which terminates our application.

        WinMain proc hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:SDWORD

The above line is the function declaration of WinMain. Note the parameter:type pairs that follow the PROC directive. They are parameters that WinMain receives from the caller. You can refer to these parameters by name instead of by stack manipulation. In addition, MASM will generate the prologue and epilogue code for the function. So we don't have to concern ourselves with the stack frame on function entry and exit.

        LOCAL wc:WNDCLASSEX
        LOCAL msg:MSG
        LOCAL hwnd:HWND

The LOCAL directive allocates memory from the stack for local variables used in the function.
The LOCAL directive is immediately followed by <the name of local variable>:<variable type>. So LOCAL wc:WNDCLASSEX tells MASM to allocate memory from the stack the size of the WNDCLASSEX structure for the variable named wc. We can refer to wc in our code without any of the difficulty involved in stack manipulation. That's really a godsend, I think. The downside is that local variables cannot be used outside the function they're created in and will be automatically destroyed when the function returns to the caller. Another drawback is that you cannot initialize local variables automatically because they're just stack memory allocated dynamically on function start. You have to manually assign them the desired values after the LOCAL directives.

        mov wc.cbSize,SIZEOF WNDCLASSEX
        mov wc.style, CS_HREDRAW or CS_VREDRAW
        mov wc.lpfnWndProc, OFFSET WndProc
        mov wc.cbClsExtra,NULL
        mov wc.cbWndExtra,NULL
        push hInstance
        pop wc.hInstance
        mov wc.hbrBackground,COLOR_WINDOW+1
        mov wc.lpszMenuName,NULL
        mov wc.lpszClassName,OFFSET ClassName
        invoke LoadIcon,NULL,IDI_APPLICATION
        mov wc.hIcon,eax
        mov wc.hIconSm,0
        invoke LoadCursor,NULL,IDC_ARROW
        mov wc.hCursor,eax
        invoke RegisterClassEx, addr wc         ; register our window class

The intimidating lines above are really simple in concept. It just takes several lines of instructions to accomplish. The concept behind all these lines is the window class. A window class is nothing more than a blueprint or specification of a window. It defines several important characteristics of a window such as its icon, its cursor, the function responsible for it, its color etc. You create a window from a window class. This is some sort of object-oriented concept. If you want to create more than one window with the same characteristics, it stands to reason to store all these characteristics in only one place and refer to them when needed. This scheme will save lots of memory by avoiding duplication of information.
Remember, Windows was designed at a time when memory chips were prohibitively expensive and most computers had 1 MB of memory. Windows had to be very efficient in using the scarce memory resource. The point is: if you define your own window, you must fill the desired characteristics of your window in a WNDCLASS or WNDCLASSEX structure and call RegisterClass or RegisterClassEx before you're able to create your window. You only have to register the window class once for each window type you want to create a window from.

Windows has several predefined window classes, such as button and edit box. For these windows (or controls), you don't have to register a window class; just call CreateWindowEx with the predefined class name.

The single most important member in the WNDCLASSEX is lpfnWndProc. lpfn stands for long pointer to function. Under Win32, there's no "near" or "far" pointer, just pointer, because of the new FLAT memory model. But this is again a leftover from the days of Win16. Each window class must be associated with a function called the window procedure. The window procedure is responsible for message handling of all windows created from the associated window class. Windows will send messages to the window procedure to notify it of important events concerning the windows it's responsible for, such as user keyboard or mouse input. It's up to the window procedure to respond intelligently to each window message it receives. You will spend most of your time writing event handlers in the window procedure.

I'll describe each member of WNDCLASSEX below:

        typedef struct tagWNDCLASSEX {
            UINT        cbSize;
            UINT        style;
            WNDPROC     lpfnWndProc;
            int         cbClsExtra;
            int         cbWndExtra;
            HINSTANCE   hInstance;
            HICON       hIcon;
            HCURSOR     hCursor;
            HBRUSH      hbrBackground;
            LPCSTR      lpszMenuName;
            LPCSTR      lpszClassName;
            HICON       hIconSm;
        } WNDCLASSEX;

cbSize: The size of the WNDCLASSEX structure in bytes. We can use the SIZEOF operator to get the value.

style: The style of windows created from this class.
You can combine several styles together using the "or" operator.

lpfnWndProc: The address of the window procedure responsible for windows created from this class.

cbClsExtra: Specifies the number of extra bytes to allocate following the window-class structure. The operating system initializes the bytes to zero.

cbWndExtra: Specifies the number of extra bytes to allocate following the window instance. The operating system initializes the bytes to zero. If an application uses the WNDCLASS structure to register a dialog box created by using the CLASS directive in the resource file, it must set this member to DLGWINDOWEXTRA.

hInstance: Instance handle of the module.

hIcon: Handle to the icon. Get it from a LoadIcon call.

hCursor: Handle to the cursor. Get it from a LoadCursor call.

hbrBackground: Handle to the brush used to paint the background of windows created from the class, or a system color value plus 1 (as with COLOR_WINDOW+1 in our example).

lpszMenuName: Name of the default menu resource for windows created from the class.

lpszClassName: The name of this window class.

hIconSm: Handle to a small icon that is associated with the window class. If this member is NULL, the system searches the icon resource specified by the hIcon member for an icon of the appropriate size to use as the small icon.

        invoke CreateWindowEx, NULL,\
                ADDR ClassName,\
                ADDR AppName,\
                WS_OVERLAPPEDWINDOW,\
                CW_USEDEFAULT,\
                CW_USEDEFAULT,\
                CW_USEDEFAULT,\
                CW_USEDEFAULT,\
                NULL,\
                NULL,\
                hInst,\
                NULL

After registering the window class, we can call CreateWindowEx to create our window based on the submitted window class. Notice that there are 12 parameters to this function. The C function prototype of CreateWindowEx is below:

        HWND WINAPI CreateWindowExA(
            DWORD dwExStyle,
            LPCSTR lpClassName,
            LPCSTR lpWindowName,
            DWORD dwStyle,
            int X,
            int Y,
            int nWidth,
            int nHeight,
            HWND hWndParent,
            HMENU hMenu,
            HINSTANCE hInstance,
            LPVOID lpParam);

Let's see a detailed description of each parameter:

dwExStyle: Extra window styles. This is the new parameter that is added to the old CreateWindow.
You can put new window styles for Windows 95 & NT here. You can specify your ordinary window style in dwStyle, but if you want some special styles such as a topmost window, you must specify them here. You can use NULL if you don't want extra window styles.

lpClassName: (Required). Address of the ASCIIZ string containing the name of the window class you want to use as a template for this window. The class can be your own registered class or a predefined window class. As stated above, every window you create must be based on a window class.

lpWindowName: Address of the ASCIIZ string containing the name of the window. It will be shown on the title bar of the window. If this parameter is NULL, the title bar of the window will be blank.

dwStyle: Styles of the window. You can specify the appearance of the window here. Passing NULL is OK, but the window will have no system menu box, no minimize-maximize buttons, and no close-window button. The window would not be of much use at all; you would need to press Alt+F4 to close it. The most common window style is WS_OVERLAPPEDWINDOW. A window style is only a bit flag, so you can combine several window styles with the "or" operator to achieve the desired appearance of the window. The WS_OVERLAPPEDWINDOW style is actually a combination of the most common window styles made this way.

X, Y: The coordinates of the upper left corner of the window. Normally these values should be CW_USEDEFAULT, that is, you want Windows to decide for you where to put the window on the desktop.

nWidth, nHeight: The width and height of the window in pixels. You can also use CW_USEDEFAULT to let Windows choose the appropriate width and height for you.

hWndParent: A handle to the window's parent window (if one exists). This parameter tells Windows whether this window is a child (subordinate) of some other window and, if it is, which window is the parent. Note that this is not the parent-child relationship of the multiple document interface (MDI).
Child windows are not bound to the client area of the parent window. This relationship is specifically for Windows internal use. If the parent window is destroyed, all child windows will be destroyed automatically. It's really that simple. Since there is only one window in our example, we specify this parameter as NULL.

hMenu: A handle to the window's menu. NULL if the class menu is to be used. Look back at the lpszMenuName member of the WNDCLASSEX structure. lpszMenuName specifies the *default* menu for the window class. Every window created from this window class will have the same menu by default, unless you specify an *overriding* menu for a specific window via its hMenu parameter. hMenu is actually a dual-purpose parameter. If the window you want to create is of a predefined window type (i.e. a control), it cannot own a menu, and hMenu is used as that control's ID instead. Windows can decide whether hMenu is really a menu handle or a control ID by looking at the lpClassName parameter. If it's the name of a predefined window class, hMenu is a control ID. If it's not, then it's a handle to the window's menu.

hInstance: The instance handle for the program module creating the window.

lpParam: Optional pointer to a data structure passed to the window. This is used by an MDI window to pass the CLIENTCREATESTRUCT data. Normally, this value is set to NULL, meaning that no data is passed via CreateWindowEx. The window receives this value in the lpCreateParams member of the CREATESTRUCT structure that the lParam of the WM_CREATE message points to.

    mov    hwnd,eax
    invoke ShowWindow, hwnd,CmdShow
    invoke UpdateWindow, hwnd

After a successful return from CreateWindowEx, the window handle is in eax. We must keep this value for future use. The window we just created is not automatically displayed. You must call ShowWindow with the window handle and the desired *display state* of the window to make it appear on the screen. Next you can call UpdateWindow to order your window to repaint its client area.
This function is useful when you want to update the content of the client area. You can omit this call though.

    .WHILE TRUE
        invoke GetMessage, ADDR msg,NULL,0,0
        .BREAK .IF (!eax)
        invoke TranslateMessage, ADDR msg
        invoke DispatchMessage, ADDR msg
    .ENDW

At this time, our window is up on the screen, but it cannot receive input from the world. So we have to *inform* it of relevant events. We accomplish this with a message loop. There's only one message loop for each module. This message loop continually checks for messages from Windows with the GetMessage call. GetMessage passes a pointer to a MSG structure to Windows. This MSG structure will be filled with information about the message that Windows wants to send to a window in the module. The GetMessage function will not return until there's a message for a window in the module. During that time, Windows can give control to other programs. This is what forms the cooperative multitasking scheme of the Win16 platform. GetMessage returns FALSE if a WM_QUIT message is received, which, in the message loop, will terminate the loop and exit the program.

TranslateMessage is a utility function that takes raw keyboard input and generates a new message (WM_CHAR) that is placed on the message queue. The WM_CHAR message contains the ASCII value for the key pressed, which is easier to deal with than the raw keyboard scan codes. You can omit this call if your program doesn't process keystrokes.

DispatchMessage sends the message data to the window procedure responsible for the specific window the message is for.

        mov eax,msg.wParam
        ret
    WinMain endp

If the message loop terminates, the exit code is stored in the wParam member of the MSG structure. You can store this exit code in eax to return it to Windows. At present, Windows does not make use of the return value, but it's better to be on the safe side and play by the rules.

    WndProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM

This is our window procedure.
You don't have to name it WndProc. The first parameter, hWnd, is the handle of the window that the message is destined for. uMsg is the message. Note that uMsg is not a MSG structure. It's just a number, really. Windows defines hundreds of messages, most of which your programs will not be interested in. Windows will send an appropriate message to a window when something relevant to that window happens. The window procedure receives the message and reacts to it intelligently. wParam and lParam are just extra parameters for use by some messages. Some messages do send accompanying data in addition to the message itself. That data is passed to the window procedure by means of lParam and wParam.

        mov eax,uMsg
        .IF eax==WM_DESTROY
            invoke PostQuitMessage,NULL
            xor eax,eax
        .ELSE
            invoke DefWindowProc,hWnd,uMsg,wParam,lParam
        .ENDIF
        ret
    WndProc endp

Here comes the crucial part. This is where most of your program's intelligence resides. The code that responds to each Windows message is in the window procedure. Your code must check the Windows message to see if it's a message it's interested in. If it is, do anything you want to do in response to that message and then return with zero in eax. If it's not, you MUST pass ALL parameters on for default processing by DefWindowProc. DefWindowProc is an API function that processes the messages your program is not interested in.

The only message that you MUST respond to is WM_DESTROY. This message is sent to your window procedure whenever your window is closed. By the time your window procedure receives this message, your window has been removed from the screen. This is just a notification that your window is now destroyed and that you should prepare yourself to return to Windows. In response to this, you can perform housekeeping prior to returning to Windows. You have no choice but to quit when it comes to this state. If you want to have a chance to stop the user from closing your window, you should process the WM_CLOSE message.
Now back to WM_DESTROY: after performing housekeeping chores, you must call PostQuitMessage, which will post WM_QUIT back to your module. WM_QUIT will make GetMessage return with a zero value in eax, which in turn terminates the message loop and quits to Windows. You can send the WM_DESTROY message to your own window procedure by calling the DestroyWindow function.

[Reprinted with permission from Iczelion's Win32 Assembly HomePage]
http://203.148.211.201/iczelion/index.html

::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................WIN32.ASSEMBLY.PROGRAMMING

Painting with Text
by Iczelion

In this tutorial, we will learn how to "paint" text in the client area of a window. We'll also learn about the device context. You can download the source code here:
http://203.148.211.201/iczelion/files/tut04.zip

Preliminary
-----------
Text in Windows is a type of GUI object. Each character is composed of numerous pixels that are lumped together into a distinct pattern. That's why it's called "painting" instead of "writing". Normally, you paint text in your own client area (actually, you can paint outside the client area, but that's another story). Putting text on the screen in Windows is drastically different from DOS. In DOS, you can think of the screen as an 80x25 grid. But in Windows, the screen is shared by several programs. Some rules must be enforced to keep programs from writing over each other's screen data. Windows ensures this by limiting the painting area of each window to its own client area only. The size of the client area of a window is not constant. The user can change the size at any time, so you must determine the dimensions of the client area dynamically, at runtime.

Before you can paint something on the client area, you must ask for permission from Windows. That's right, you don't have absolute control of the screen as you did in DOS. You must ask Windows for permission to paint your own client area.
Windows will determine the size of your client area, font, colors and other GDI attributes and send a handle to a device context back to your program. You can then use the device context as a passport to painting on your client area.

What is a device context? It's just a data structure maintained internally by Windows. A device context is associated with a particular device, such as a printer or video display. For a video display, a device context is usually associated with a particular window on the display.

Some of the values in the device context are graphic attributes such as colors, font etc. These are default values which you can change at will. They exist to spare you from having to specify these attributes in every GDI function call.

When a program needs to paint, it must obtain a handle to a device context. Normally, there are two ways to accomplish this:

* call BeginPaint in response to the WM_PAINT message.
* call GetDC in response to other messages.

One thing you must remember: you must release the device context handle within the processing of the same message in which you obtained it. Don't obtain the handle in response to one message and release it in response to another.

Windows posts WM_PAINT messages to a window to notify it that it's now time to repaint its client area. Windows does not save the content of the client area of a window. Instead, when a situation occurs that warrants a repaint of the client area (such as when a window was covered by another and is just brought back in front), Windows puts a WM_PAINT message in that window's message queue. It's the responsibility of that window to repaint its own client area. You must gather all information about how to repaint your client area in the WM_PAINT section of your window procedure, so the window procedure can repaint the client area when a WM_PAINT message arrives.

Another concept you must come to terms with is the invalid rectangle.
Windows defines an invalid rectangle as the smallest rectangular area in the client area that needs to be repainted. When Windows detects an invalid rectangle in the client area of a window, it posts a WM_PAINT message to that window. In response to the WM_PAINT message, the window can obtain a PAINTSTRUCT structure which contains, among other things, the coordinates of the invalid rectangle. You call BeginPaint in response to the WM_PAINT message to validate the invalid rectangle. If you don't process the WM_PAINT message, at the very least you must call DefWindowProc or ValidateRect to validate the invalid rectangle, or else Windows will repeatedly send you WM_PAINT messages.

Here are the steps you perform in response to a WM_PAINT message:

1. Get a handle to a device context with BeginPaint.
2. Paint the client area.
3. Release the handle to the device context with EndPaint.

Note that you don't have to explicitly validate the invalid rectangle. It's automatically done by the BeginPaint call. Between the BeginPaint-EndPaint pair, you can call any GDI function to paint your client area. Nearly every one of them requires a handle to a device context as a parameter.

Content:

We will write a program that displays the text string "Win32 assembly is great and easy!" in the center of the client area.

    include windows.inc
    includelib user32.lib
    includelib kernel32.lib

    .DATA
    ClassName   db "SimpleWinClass",0
    AppName     db "Our First Window",0
    OurText     db "Win32 assembly is great and easy!",0

    .DATA?
    hInstance   HINSTANCE ?
    CommandLine LPSTR ?
    .CODE
    start:
        invoke GetModuleHandle, NULL
        mov    hInstance,eax
        invoke GetCommandLine
        mov    CommandLine,eax          ; keep the pointer that is passed to WinMain
        invoke WinMain, hInstance,NULL,CommandLine, SW_SHOWDEFAULT
        invoke ExitProcess,eax

    WinMain proc hInst:HINSTANCE, hPrevInst:HINSTANCE, CmdLine:LPSTR, CmdShow:SDWORD
        LOCAL wc:WNDCLASSEX
        LOCAL msg:MSG
        LOCAL hwnd:HWND
        mov   wc.cbSize,SIZEOF WNDCLASSEX
        mov   wc.style, CS_HREDRAW or CS_VREDRAW
        mov   wc.lpfnWndProc, OFFSET WndProc
        mov   wc.cbClsExtra,NULL
        mov   wc.cbWndExtra,NULL
        push  hInstance
        pop   wc.hInstance
        mov   wc.hbrBackground,COLOR_WINDOW+1
        mov   wc.lpszMenuName,NULL
        mov   wc.lpszClassName,OFFSET ClassName
        invoke LoadIcon,NULL,IDI_APPLICATION
        mov   wc.hIcon,eax
        mov   wc.hIconSm,0
        invoke LoadCursor,NULL,IDC_ARROW
        mov   wc.hCursor,eax
        invoke RegisterClassEx, addr wc
        invoke CreateWindowEx,NULL,ADDR ClassName,ADDR AppName,\
               WS_OVERLAPPEDWINDOW,CW_USEDEFAULT,\
               CW_USEDEFAULT,CW_USEDEFAULT,CW_USEDEFAULT,NULL,NULL,\
               hInst,NULL
        mov   hwnd,eax
        invoke ShowWindow, hwnd,SW_SHOWNORMAL
        invoke UpdateWindow, hwnd
        .WHILE TRUE
            invoke GetMessage, ADDR msg,NULL,0,0
            .BREAK .IF (!eax)
            invoke TranslateMessage, ADDR msg
            invoke DispatchMessage, ADDR msg
        .ENDW
        mov   eax,msg.wParam
        ret
    WinMain endp

    WndProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
        LOCAL hdc:HDC
        LOCAL ps:PAINTSTRUCT
        LOCAL rect:RECT
        mov   eax,uMsg
        .IF eax==WM_DESTROY
            invoke PostQuitMessage,NULL
        .ELSEIF eax==WM_PAINT
            invoke BeginPaint,hWnd, ADDR ps
            mov    hdc,eax
            invoke GetClientRect,hWnd, ADDR rect
            invoke DrawText, hdc,ADDR OurText,-1, ADDR rect, \
                   DT_SINGLELINE or DT_CENTER or DT_VCENTER
            invoke EndPaint,hWnd, ADDR ps
        .ELSE
            invoke DefWindowProc,hWnd,uMsg,wParam,lParam
            ret
        .ENDIF
        xor   eax, eax
        ret
    WndProc endp
    end start

The majority of the code is the same as the example in tutorial 3. I'll explain only the important changes.

    LOCAL hdc:HDC
    LOCAL ps:PAINTSTRUCT
    LOCAL rect:RECT

These are local variables that are used by GDI functions in our WM_PAINT section. hdc is used to store the handle to the device context returned from the BeginPaint call. ps is a PAINTSTRUCT structure.
Normally you don't use the values in ps. It's passed to the BeginPaint function and Windows fills it with appropriate values. You then pass ps to the EndPaint function when you finish painting the client area. rect is a RECT structure defined as follows:

    RECT Struct
        left    LONG ?
        top     LONG ?
        right   LONG ?
        bottom  LONG ?
    RECT ends

left and top are the coordinates of the upper left corner of a rectangle. right and bottom are the coordinates of the lower right corner. One thing to remember: the origin of the x-y axes is at the upper left corner of the client area, so the point y=10 is BELOW the point y=0.

    invoke BeginPaint,hWnd, ADDR ps
    mov    hdc,eax
    invoke GetClientRect,hWnd, ADDR rect
    invoke DrawText, hdc,ADDR OurText,-1, ADDR rect, \
           DT_SINGLELINE or DT_CENTER or DT_VCENTER
    invoke EndPaint,hWnd, ADDR ps

In response to the WM_PAINT message, you call BeginPaint with the handle to the window you want to paint and an uninitialized PAINTSTRUCT structure as parameters. After a successful call, eax contains the handle to the device context. Next you call GetClientRect to retrieve the dimensions of the client area. The dimensions are returned in the rect variable, which you pass to DrawText as one of its parameters. DrawText's syntax is:

    int WINAPI DrawText(HDC hdc, LPCSTR lpString, int nCount,
                        LPRECT lpRect, UINT uFormat);

DrawText is a high-level text output API function. It handles some gory details such as word wrap, centering etc. so you can concentrate on the string you want to paint. Its low-level brother, TextOut, will be examined in the next tutorial. DrawText formats a text string to fit within the bounds of a rectangle. It uses the currently selected font, color and background (in the device context) to draw the text. Lines are wrapped to fit within the bounds of the rectangle. It returns the height of the output text in device units, in our case, pixels. Let's see its parameters:

hdc  Handle to the device context.

lpString  A pointer to the string you want to draw in the rectangle.
The string must be null-terminated, or else you have to specify its length in the next parameter, nCount.

nCount  The number of characters to output. If the string is null-terminated, nCount must be -1. Otherwise nCount must contain the number of characters in the string you want to draw.

lpRect  A pointer to the rectangle (a structure of type RECT) you want to draw the string in. Note that this rectangle is also a clipping rectangle, that is, you cannot draw the string outside this rectangle.

uFormat  The value that specifies how the string is displayed in the rectangle. We use three values combined by the "or" operator:

    DT_SINGLELINE  specifies a single line of text.
    DT_CENTER      centers the text horizontally.
    DT_VCENTER     centers the text vertically. Must be used with DT_SINGLELINE.

After you finish painting the client area, you must call the EndPaint function to release the handle to the device context. That's it. We can summarize the salient points here:

* You call the BeginPaint-EndPaint pair in response to the WM_PAINT message.
* Do anything you like with the client area between the calls to BeginPaint and EndPaint.
* If you want to repaint your client area in response to other messages, you have two choices:
    - Use the GetDC-ReleaseDC pair and do your painting between these calls.
    - Call InvalidateRect to invalidate the client area, forcing Windows to put a WM_PAINT message in the message queue of your window, and do your painting in the WM_PAINT section. Call UpdateWindow afterwards if the repaint must not wait.

[Reprinted with permission from Iczelion's Win32 Assembly HomePage]
http://203.148.211.201/iczelion/index.html

::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::........................THE.C.STANDARD.LIBRARY.IN.ASSEMBLY

The _Xprintf functions
by Xbios2

I. INTRODUCTION
---------------
This is the second article I have written on the C standard library, and perhaps some ask: "Why should this interest us?" or, more gently, "What's the philosophy behind these articles?".
Well, here is why I write these articles:

- For C programmers who want to know what happens behind the HLL 'curtain'
- For asm programmers who wish to get ideas
- For asm programmers who need a C command but want to keep their code 'slim' (actually the code section is intended more as source to compile and use than source to read and understand; that's why it's not always well-commented in a tutorial-like manner)
- For me, to better understand reverse-engineering and assembly coding.

Ok, now go for it....

II. WHAT C DOES
---------------
How the various _printf functions (_Xprintf) work:

The _Xprintf functions call the ___vprinter function with four parameters:

1. output function address
2. output function parameter
3. pointer to format string
4. pointer to arguments list

Parameter 1 is a pointer to a function that outputs the resulting string (to a file, stdout or to memory).
Parameter 2 is passed to the function pointed at by parameter 1, together with the string pointer and its length.
Parameter 3 is 'forwarded' by _Xprintf exactly as received from the user.
Parameter 4 is either 'forwarded' (by the _vXprintf functions) or points to the stack (for 'normal' _Xprintf functions).

Functions that send output to a file or to STDOUT also lock/unlock the stream. Besides that, all the 'dirty job' is passed to ___vprinter.

How ___vprinter works:
[the disassembly of ___vprinter would show this better, but is far too large]

1. Read (next) char from format string
2. If char is NUL, finish
3. If char is not a '%', output it verbatim, loop back to [1]
4. If char is '%' and next char is also '%', output a single '%' and loop to [1]
5. Process the string up to a 'type_char'
   If everything is ok, output the result, loop to [1]
   If there is an unknown char, output the rest of the string verbatim, finish

It is interesting to notice how ___vprinter does its output: all output is performed character by character.
To do this ___vprinter calls another routine (let's call it _storechar), passing it two parameters: the character to store and a pointer to an 80-byte string on the stack of ___vprinter (actually in the C source that must have been a pointer to a local structure, because _storechar also modifies locals after those 80 bytes). _storechar writes the character into the string, and if the string is filled up, it calls a second function (call it _writestring) that calls the function whose pointer was passed to ___vprinter. Before returning, ___vprinter calls _writestring directly to output whatever bytes were left. _writestring is also responsible for setting a flag that will cause ___vprinter (and consequently _Xprintf) to return -1 instead of the number of chars output.

This way of performing output has the advantage of printing long strings without allocating much memory, while printing small strings using the output function only once. Actually this is the only advantage it has. Even if this solution were written well (which it is _not_), it would still be awful in _sprintf and _vsprintf. In _(v)sprintf chars are written into the local buffer first; then, when this fills up, the second function (_writestring) is called, which calls a third function (included in the same .OBJ file as _sprintf), which finally calls _memcpy. With careful rewriting of sprintf, this could be achieved with just a simple, one-byte 'stosb'. Then printf and fprintf could be implemented atop sprintf. The problem here is that those functions would have to 'know' how much buffer space to allocate. Maybe the solution to this could be to leave allocating buffers to the user, by just giving a sprintf function (actually Microsoft thought of this before me, and they give only wsprintf and wvsprintf in the Win32 API).

This article will actually focus on a vsprintf function, with all the format specifiers in Borland C (EXCEPT floating point numbers, which would (and maybe will) require a separate article).
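
The buffering scheme just described can be sketched in portable C. This is only an illustration of the idea, not Borland's code: the names store_char, write_string, printer, put_fn, mem_sink and buffered_copy are hypothetical stand-ins for _storechar, _writestring and the output function whose address is passed to ___vprinter, and the formatter is reduced to a verbatim copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 80                    /* size of the local buffer, per the text */

/* Hypothetical callback type: plays the role of ___vprinter's parameter 1.
   It receives parameter 2 (ctx) plus a string pointer and its length,
   and returns 0 on success. */
typedef int (*put_fn)(void *ctx, const char *s, size_t n);

typedef struct {
    char   buf[CHUNK];
    size_t used;        /* bytes waiting in buf */
    long   total;       /* characters stored so far */
    int    failed;      /* set when the callback reports an error */
    put_fn put;
    void  *ctx;
    int    calls;       /* how many times the callback ran */
} printer;

/* Role of _writestring: hand the pending bytes to the output function. */
static void write_string(printer *p)
{
    if (p->used == 0) return;
    p->calls++;
    if (p->put(p->ctx, p->buf, p->used) != 0) p->failed = 1;
    p->used = 0;
}

/* Role of _storechar: append one character, flush when the buffer fills. */
static void store_char(printer *p, char c)
{
    p->buf[p->used++] = c;
    p->total++;
    if (p->used == CHUNK) write_string(p);
}

/* Stand-in for the formatter: copies text verbatim, flushes what is left,
   and returns -1 if any flush failed, else the number of chars output. */
static long print_verbatim(printer *p, const char *s)
{
    while (*s) store_char(p, *s++);
    write_string(p);
    return p->failed ? -1 : p->total;
}

/* A memory sink, playing the part of the sprintf back end (the memcpy). */
typedef struct { char *out; size_t len; } mem_sink;

static int mem_put(void *ctx, const char *s, size_t n)
{
    mem_sink *m = ctx;
    memcpy(m->out + m->len, s, n);
    m->len += n;
    m->out[m->len] = '\0';
    return 0;
}

/* Convenience wrapper: run `text` through the buffered path into `dst`
   (dst must be large enough) and report how often the sink was called. */
static long buffered_copy(char *dst, const char *text, int *calls)
{
    mem_sink m = { dst, 0 };
    printer p = { {0}, 0, 0, 0, mem_put, &m, 0 };
    long n;
    dst[0] = '\0';
    n = print_verbatim(&p, text);
    if (calls) *calls = p.calls;
    return n;
}
```

A 200-character string goes through the sink in three pieces (80 + 80 + 40), while a short string costs a single call; in the memory case each of those calls is a memcpy that a direct store into the destination would make unnecessary, which is the point the text makes about sprintf.
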
Also keep in mind that UNIX has a rather more complicated Xprintf set, which I'm glad to ignore :)

III. SOME COMMENTS ON THE CODE
-----------------------------
This is not exactly 'clear' code. This is because it was not written from scratch, but is the result of hand-optimization applied to the disassembly of ___vprinter (actually Borland could sue me for this, but they'd really have a hard time trying to show that my code resembles theirs :)). That is, starting from an incomprehensible but working source code, I kept changing the source code and compiling until I got a better source code (yet still incomprehensible :). That's also a reason the code is poorly commented.

Anyway, if you're just interested in a simple _sprintf function, skip to the code section. For the curious, here are some differences my version has:

- A self-contained procedure
  That is, there is only a _sprintf function, which calls nothing, while Borland's _sprintf involves: ___vprinter, ___longtoa, ___strlen, plus three other functions called by ___vprinter (_storechar, _writestring and another one that converts pointers into hex)
- Much smaller code
- Much less stack used
- Probably faster code (actually it is not a speed-optimized version, yet it must be much faster)
- It's home-made, and brand-new :)

IV. THE CODE
-------------
Well, as I said, you're not expected to understand it at once. Yet, if you insist, read and enjoy...
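
One idiom in the listing may be unfamiliar: in ltoaLoop and loopPointer, the four instructions add al,90h / daa / adc al,40h / daa turn a value 0..15 in AL into its ASCII hex digit without a branch or a lookup table. The C sketch below emulates just enough of the CPU (AL, CF, AF) to trace that sequence. The names cpu8, add8, daa, hex_digit and hex_digit_cased are mine, and the daa model follows the documented behaviour of the instruction; treat it as an aid to reading the listing, not as part of it.

```c
#include <assert.h>

/* Minimal model of the state the sequence depends on. */
typedef struct { unsigned char al; int cf, af; } cpu8;

/* ADD/ADC AL, imm8: carry_in is 0 for ADD, the current CF for ADC. */
static void add8(cpu8 *c, unsigned char imm, int carry_in)
{
    unsigned sum = (unsigned)c->al + imm + (unsigned)carry_in;
    c->af = ((c->al & 0x0F) + (imm & 0x0F) + carry_in) > 0x0F;
    c->cf = sum > 0xFF;
    c->al = (unsigned char)sum;
}

/* DAA: decimal adjust AL after addition. */
static void daa(cpu8 *c)
{
    unsigned char old_al = c->al;
    int old_cf = c->cf;
    c->cf = 0;
    if ((c->al & 0x0F) > 9 || c->af) {
        c->cf = old_cf || (c->al > 0xF9);   /* carry out of AL + 6 */
        c->al = (unsigned char)(c->al + 6);
        c->af = 1;
    } else {
        c->af = 0;
    }
    if (old_al > 0x99 || old_cf) {
        c->al = (unsigned char)(c->al + 0x60);
        c->cf = 1;
    } else {
        c->cf = 0;
    }
}

/* The sequence from the listing, for v in 0..15. */
static char hex_digit(unsigned v)
{
    cpu8 c = { (unsigned char)(v & 0x0F), 0, 0 };
    add8(&c, 0x90, 0);        /* add al, 90h : 90h..9Fh                     */
    daa(&c);                  /* daa         : 0..9 stay, 10..15 wrap to    */
                              /*               00h..05h with CF set         */
    add8(&c, 0x40, c.cf);     /* adc al, 40h : D0h..D9h, or 41h..46h        */
    daa(&c);                  /* daa         : D0h..D9h become 30h..39h     */
    return (char)c.al;
}

/* ltoaLoop then executes "or al, bh", where bh = format char XOR 'X':
   0 for %X and 20h for %x. OR-ing 20h leaves '0'..'9' unchanged and
   turns 'A'..'F' into 'a'..'f'. */
static char hex_digit_cased(unsigned v, char fmt)
{
    return (char)(hex_digit(v) | (fmt ^ 'X'));
}
```

For example, hex_digit(11) follows the path 9Bh, 01h with carry, 42h, which is 'B', and hex_digit_cased(11, 'x') ORs in 20h to give 'b'.
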
; sprintf.asm ============================================================
.386
.model flat

getarg  macro   register
        lea     eax, [a_argList]
        mov     edx, [eax]
        add     dword ptr [eax], 4
        mov     register, [edx]
endm

.data
Null    db      '(null)',0
        align   4
jumptable dd    offset BlankOrPlus      ; 0
        dd      offset HashSign         ; 1
        dd      offset Asterisk         ; 2
        dd      offset MinusSign        ; 3
        dd      offset Dot              ; 4
        dd      offset Digit            ; 5
        dd      offset h_shortint       ; 6
        dd      offset d_decimal        ; 7
        dd      offset o_octal          ; 8
        dd      offset u_unsigned       ; 9
        dd      offset x_Hexadecimal    ; 10
        dd      offset p_pointer        ; 11
        dd      offset unknown          ; 12 = f_floating
        dd      offset c_char           ; 13
        dd      offset s_string         ; 14
        dd      offset n_CharsWritten   ; 15
        dd      offset formatLoop       ; 16 = Ignore character
        dd      offset unknown          ; 17 = Unknown char
        dd      offset Percent          ; 18

;           sp   !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
xxlat   db   0, 17, 17,  1, 17, 18, 17, 17, 17, 17,  2,  0, 17,  3,  4, 17
;            0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
        db   5,  5,  5,  5,  5,  5,  5,  5,  5,  5, 17, 17, 17, 17, 17, 17
;            @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
        db  17, 17, 17, 17, 17, 12, 16, 12,  8, 17, 17, 17, 16, 17, 16, 17
;            P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
        db  17, 17, 17, 17, 17, 17, 17, 17, 10, 17, 17, 17, 17, 17, 17, 17
;            `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
        db  17, 17, 17, 13,  7, 12, 12, 12,  6,  7, 17, 17, 16, 17, 15,  8
;            p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~ DEL
        db  11, 17, 17, 14, 17,  9, 17, 17, 10, 17, 17, 17, 17, 17, 17, 17

.code
_vsprintf proc C near uses ebx edi esi, a_output:dword, a_format:dword, \
                a_argList:dword
        local   v_width:dword, v_prec:dword, v_zeroLen:dword, \
                v_sign:dword, v_strbuf:byte:12, v_strLen:dword

        mov     esi, [a_format]
        mov     edi, [a_output]
mainLoop:
        lodsb                           ; get character
        cmp     al, '%'                 ; test if it is '%'
        je      short controlChar
        stosb                           ; if not, just copy it
        test    al, al
        jnz     short mainLoop          ; if char is not NULL, loop
        jmp     EndOfString             ; jump if char is null
; ---------------------------------------------------------------------------
controlChar:
        xor     ecx, ecx                ; set stage to 0
        or      eax, -1
        xor     ebx, ebx                ; no flags set
        mov     [v_width], eax          ; no width given
        mov     [v_zeroLen], ecx        ; 0
        mov     [v_prec], eax           ; no .prec given
        mov     [v_sign], ecx           ; 0, no sign prefix
formatLoop:
        xor     eax, eax
        lodsb
        cmp     al, ' '
        jl      unknown                 ; char below ' '
        movzx   edx, byte ptr xxlat - ' '[eax]
        jmp     jumptable[edx*4]        ; we jump with the char in AL
; ---------------------------------------------------------------------------
n_CharsWritten:
        getarg  eax
        mov     edx, edi
        sub     edx, [a_output]         ; calculate length
        test    ebx, 16
        jnz     short nchars_short
        mov     [eax], edx
        jmp     short fw_mainloop
nchars_short:
        mov     [eax], dx
fw_mainloop:
        jmp     mainLoop
; ---------------------------------------------------------------------------
Percent:
        cmp     byte ptr [esi-2], al    ; al='%'
        jne     unknown
        stosb
        jmp     mainLoop
; ---------------------------------------------------------------------------
; flag characters
HashSign:
        or      ebx, 1
        jmp     short chkflags
MinusSign:
        or      ebx, 2
        jmp     short chkflags
BlankOrPlus:
        or      byte ptr [v_sign], al   ; ' ' will become '+'
chkflags:
        or      ecx, ecx
        jnz     unknown
        jmp     formatLoop
; ---------------------------------------------------------------------------
Asterisk:
        getarg  eax
        cmp     ecx, 2
        jge     short asterisk_prec
        test    eax, eax
        jge     short width_positive
        neg     eax
        or      ebx, 2
width_positive:
        mov     [v_width], eax
        mov     ecx, 3
        jmp     short fwwB
; - - - - - - - - - - - - - - - - - - - - - - -
asterisk_prec:
        cmp     ecx, 4
        jnz     unknown
        inc     ecx                     ; set stage to 5
        mov     [v_prec], eax
fwwB:
        jmp     formatLoop
; ---------------------------------------------------------------------------
Dot:
        cmp     ecx, 4
        jge     unknown
        mov     ecx, 4
        inc     [v_prec]                ; set .prec to 0
        jmp     formatLoop
; ---------------------------------------------------------------------------
Digit:
        sub     al, '0'                 ; convert ASCII to value
        jnz     short digit2
        or      ecx, ecx
        jnz     short digit2
        test    ebx, 2                  ; we come here if width=0n
        jnz     short fwwC
        or      ebx, 8
        inc     ecx                     ; set stage to 1
        jmp     fwwC
; - - - - - - - - - - - - - - - - - - - - - - -
digit2:
        cmp     ecx, 2
        jg      short digit_prec
        mov     ecx, 2
        cmp     [v_width], 0
        jge     short digit_width
        mov     [v_width], eax
        jmp     short fwwC
; - - - - - - - - - - - - - - - - - - - - - - -
digit_width:
        imul    edx, [v_width], 10
        add     eax, edx
        mov     [v_width], eax
        jmp     short fwwC
; - - - - - - - - - - - - - - - - - - - - - - -
digit_prec:
        cmp     ecx, 4
        jnz     unknown
        imul    edx, [v_prec], 10
        add     eax, edx
        mov     [v_prec], eax
fwwC:
        jmp     formatLoop
; ---------------------------------------------------------------------------
h_shortint:
        or      ebx, 16
        mov     ecx, 5
        jmp     formatLoop
; ---------------------------------------------------------------------------
o_octal:
        mov     ecx, 8                  ; radix
        test    ebx, 1
        jz      short unsigned
        mov     byte ptr [v_sign], '0'
        jmp     short integer
u_unsigned:
        mov     ecx, 10                 ; radix
unsigned:
        mov     byte ptr [v_sign], 0    ; no sign
        jmp     short integer
x_Hexadecimal:
        mov     ecx, 16                 ; radix
        mov     ah, al
        xor     al, 'X'                 ; AL is the char ('x' or 'X')
        mov     bh, al
        test    ebx, 1
        jz      short integer
        mov     al, '0'
        mov     word ptr [v_sign], ax
        jmp     short integer
d_decimal:
        mov     ecx, 10                 ; radix
        or      ebx, 32
integer:
        getarg  eax
        test    ebx, 16
        jz      short integer_cnvt      ; if not short, don't change
short_integer:
        test    ebx, 32                 ; is integer signed?
        jnz     short short_signed
        and     eax, 0FFFFh             ; zero extend 16 to 32
        jmp     short nosign
short_signed:
        cwde                            ; sign extend 16 to 32
integer_cnvt:
        test    ebx, 32
        jz      nosign
        or      eax, eax
        jns     nosign
        neg     eax
        mov     byte ptr [v_sign], '-'
nosign:
        lea     edx, [offset v_strbuf + 11]
        or      eax, eax
        jnz     short ltoa
        cmp     [v_prec], eax           ; eax is 0 if we are here
        jnz     short zero
        mov     byte ptr [edx], al      ; value 0 with .0 prec
        mov     [v_strLen], eax         ; means no string
        jmp     printit                 ; so output no digits
zero:
        cmp     byte ptr [v_sign], '0'
        jnz     short ltoa
        mov     byte ptr[v_sign], 0     ; we don't want 0x0, nor '00'
; convert EAX into ASCII
ltoa:
        push    edi
        push    esi
        xor     esi, esi
        mov     edi, edx
        mov     byte ptr [edi], 0
ltoaLoop:
        xor     edx, edx
        div     ecx                     ; ecx is the radix
        xchg    eax, edx
        add     al,90h
        daa
        adc     al,40h
        daa
        or      al, bh                  ; switch case if needed
        dec     edi
        inc     esi
        mov     [edi], al
        xchg    eax, edx
        or      eax, eax
        jnz     short ltoaLoop
        mov     eax, esi
        mov     edx, edi
        pop     esi
        pop     edi
        mov     [v_strLen], eax
        mov     ecx, [v_prec]
        or      ecx, ecx
        js      noprec
                                        ; A precision was given
        sub     ecx, eax
        jle     short skipzerolen
        mov     [v_zeroLen], ecx        ; if prec>digits then
                                        ; add (prec-digits) '0'
        jmp     short skipzerolen
noprec:
        test    ebx, 8
        jz      short skipzerolen
        cmp     [v_width], 0
        jle     short skipzerolen
;------------------
; we come here if width=0n
        mov     ecx, [v_width]
        sub     ecx, eax                ; EAX=[v_strLen]
        jle     short skipzerolen
        mov     eax, dword ptr [v_sign]
        or      al, al
        jz      short setzerolen
        dec     ecx
        shr     eax, 8
        jz      short setzerolen
        dec     ecx
        js      short skipzerolen
setzerolen:
        mov     [v_zeroLen], ecx
skipzerolen:
        mov     eax, dword ptr [v_sign]
        or      al, al
        jz      short finishint
        dec     [v_width]
        shr     eax, 8
        jz      short finishint
        dec     [v_width]
finishint:
        mov     eax, [v_zeroLen]
        add     [v_strLen], eax
        jmp     printit
; ---------------------------------------------------------------------------
; Pointer: same as %.8X
p_pointer:
        getarg  ecx
        lea     edx, [v_strbuf]
        push    ebx
        mov     ebx, 7
loopPointer:
        mov     al, cl
        shr     ecx, 4
        and     al, 0Fh
        add     al,90h
        daa
        adc     al,40h
        daa
        mov     [edx+ebx], al
        dec     ebx
        jns     loopPointer
        pop     ebx
        mov     byte ptr [edx+8], 0
        mov     [v_strLen], 8
        jmp     printit
; ---------------------------------------------------------------------------
c_char:
        getarg  eax
        lea     edx, [v_strbuf]
        mov     [edx], eax              ; stores char (rest of EAX is
                                        ; not important)
        mov     [v_strLen], 1           ; set length to one char
        jmp     printit
; ---------------------------------------------------------------------------
s_string:
        getarg  edx
        or      eax, -1
        test    edx, edx
        jnz     short strlen_I
        mov     edx, offset Null        ; Pointer 0 prints '(null)'
strlen_I:
        inc     eax
        cmp     byte ptr [edx+eax], 0
        jnz     short strlen_I
        cmp     eax, [v_prec]
        jle     short setLen
        cmp     [v_prec], 0
        jl      short setLen
        mov     eax, [v_prec]
setLen:
        mov     [v_strLen], eax
; ---------------------------------------------------------------------------
; we must arrive here with EDX pointing to the string to print
; and its length in [v_strLen]
; left pad with spaces IF necessary
printit:
        test    ebx, 2                  ; Is it left justified?
        mov     ebx, [v_width]
        jnz     short printPrefix       ; if yes, don't pad left
        mov     ecx, ebx
        sub     ecx, [v_strLen]
        jle     printPrefix
        mov     al, ' '
        rep     stosb                   ; >>> left pad
        mov     ebx, [v_strLen]
; print one- or two-chars PREFIX
printPrefix:
        mov     eax, [v_sign]
        or      al, al
        jz      short padZero
        stosb                           ; print the sign prefix
        shr     eax, 8                  ; AL=AH, AH=0
        jz      short padZero
        stosb                           ; print the sign prefix
; pad with zeroes IF necessary
padZero:
        mov     ecx, [v_zeroLen]        ; we are sure that ecx>=0
        sub     [v_strLen], ecx
        sub     ebx, ecx
        mov     al, '0'                 ; ECX=[v_zeroLen]
        rep     stosb                   ; >>> pad with 0s
        mov     ecx, [v_strLen]
        sub     ebx, ecx
        xchg    esi, edx
        rep     movsb                   ; >>> copy string
        xchg    esi, edx
        js      short skipRightpad      ; refers to SUB EBX, ECX
        mov     ecx, ebx
        mov     al, ' '
        rep     stosb                   ; >>> right pad with ' '
skipRightpad:
        jmp     mainLoop
; ---------------------------------------------------------------------------
;
; If an unknown specification character is found, _vsprintf enters the
; following loop. This loop copies verbatim all the rest of the string
; (from the '%' on)
unknown:
        mov     al, '%'
scanback:
        dec     esi
        cmp     [esi], al
        jne     short scanback
copyrest:
        lodsb
        stosb
        test    al, al
        jnz     short copyrest
;
; ---------------------------
; return the number of chars written
EndOfString:
        mov     eax, edi
        sub     eax, [a_output]
        dec     eax
        ret
endp
ends
end
; EOF ====================================================================

::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::............................................THE.UNIX.WORLD

X-Windows in Assembly Language: Part I
by mammon_

The sensible way to write programs for X-Windows is to use a toolkit such as Xt or Gtk; the easy way would be to use a scripting package such as Python or Tcl/Tk. Modern assembly language coders, however, are known for sacrificing ease and sensibility in the name of curiosity and execution speed; it is in this spirit that the potential for programming X-Windows in assembly language will now be investigated.

X-Windows Programming
---------------------
Like other GUIs, X-Windows uses an event-driven programming style in which an application registers itself with the system, displays its main user interface, and waits for system events signalling that the user has interacted with the program. There are four main 'levels' of X-Windows Programming: XProtocol, XLib, Xt or 'toolkit' programming, and scripting.

XProtocol
X-Windows consists of the X Server which handles graphics output, keyboard and mouse input, event signalling, and commands sent from client programs (Window Managers, applications). Clients communicate with the X Server using XProtocol, which consists of byte streams exchanged between the client and the server -- in a sense, like the packets that a network client exchanges with a network server. XProtocol is virtually useless for application programming, for the coding overhead for each server request makes development impractical.
The details of XProtocol requests can be found in
'/usr/include/X11/Xproto.h'.

XLib
The equivalent of the Win32 API in X-Windows is XLib. Even if one uses
toolkits for application coding, there is no way to escape XLib coding. XLib
serves as an interface between the client programs and the X Server;
essentially, it is a library of XProtocol functions exported for use by
applications.

Xt
Toolkit programming is similar to using class libraries (like MFC, OWL, or
VCL) on the Win32 platform. There are a number of toolkits available, such as
Qt, Gtk, Xt Intrinsics, Athena, and the Motif toolkit. Each toolkit consists
of extensible widgets (like resources in Win32) that define basic window
types: buttons, scrollbars, dialogs, edit windows, etc.

Scripting
A wide variety of scripting languages are available for the Unix platform, and
many of these have windowing toolkits that enable them to produce X-Windows
applications. The most popular are Tcl/Tk, Python, and Java; needless to say,
these programming methods may not be implemented in assembly language.

The XLib Programming Model
--------------------------
An application written for the XLib interface demonstrates the basic
principles of X-Windows programming as a whole. These principles make up a
5-step method:

Step I : Connect to the Display
The first step of an X-Windows application is also the simplest: a call is
made to XOpenDisplay; the result --returned in eax of course-- is a pointer to
a Display structure. This should be saved, as it will be required for just
about every subsequent call:

   p_disp = XOpenDisplay( NULL );

Note: I am providing the sample source in C for this section; the assembler
reconstruction will be presented later.

Step II : Initialize Application Resources (Colors and Fonts)
Before a window can be displayed, it requires a Graphics Context (similar to
the Win32 DC); before the GC can be created, it requires that the colors and
fonts to be used by the window be initialized.
The simplest way to do this is to use the XLoadQueryFont function and the
WhitePixel and BlackPixel macros:

   mfontstruct = XLoadQueryFont( p_disp, "fixed");
   WhitePix = WhitePixel( p_disp, DefaultScreen(p_disp));
   BlackPix = BlackPixel( p_disp, DefaultScreen(p_disp));

Once again, the values are saved for later use. Note that a more complex
method of allocating colors will be used in the assembly code later; there, a
handle to the default X Windows colormap is obtained via a call to
XDefaultColormap, and XAllocNamedColor is used to allocate white and black
pixel values: this accomplishes the same as the above code, but without using
the macros.

Step III : Create Window(s)
There are four things that must be done to create a window: the window itself
is registered with the X Server and given a Resource ID, the GC is registered
with the X Server and given its own Resource ID, the window must specify which
events it will respond to, and finally the window must be mapped into the X
display.

Creating the window requires a call to XCreateWindow or XCreateSimpleWindow.
XCreateSimpleWindow, used below, requires the display, parent window, x and y
screen coordinates, window width and height, border width, border pixel value,
and background pixel value. XCreateWindow, used in the assembly version, is
passed the display, parent window, x & y, width & height, border width, color
depth, window class, visual attribute, value mask, and an XSetWindowAttributes
structure. A handle to the created window is returned.

   Main = XCreateSimpleWindow( p_disp, DefaultRootWindow( p_disp ), 100, 100,
                               100, 50, 1, BlackPix, WhitePix);

Creating a GC is not strictly necessary; however doing without one causes the
application appearance to be unpredictable (I found that the background of my
window became transparent).
A GC is created by calling XCreateGC, which is passed the display, window
handle, value mask, and an XGCValues structure:

   theGC = XCreateGC(p_disp, Main,(GCFont | GCForeground | GCBackground), &gcv);

Input events are selected using the XSelectInput function, which is passed the
display, window handle, and the ORed values of event masks:

   XSelectInput( p_disp, Main, ExposureMask );

Finally, the window is mapped onto the display (and therefore displayed) with
the XMapWindow call, which is relatively self-explanatory:

   XMapWindow( p_disp, Main );

At this point, the procedure must be repeated for each child window (buttons,
scrollbars, etc); the following shows the creation of a button with its own
GC, and selection of the Exposure and ButtonPress event masks:

   Exit = XCreateSimpleWindow(p_disp, Main, 15, 1, 60, 15, 1, WhitePix,
                              BlackPix);
   XSelectInput(p_disp, Exit, ExposureMask | ButtonPressMask );
   XMapWindow(p_disp, Exit);
   exitGC = XCreateGC(p_disp, Exit,(GCFont | GCForeground | GCBackground),&gcv);

Note that a separate GC is not needed for each window if they will be sharing
the same background, foreground, and font colors.

Step IV : Event Loop
The event loop is the 'meat' of the program, where the application responds to
user events. This loop calls XNextEvent to get the next system event, and
responds to the ones sent to its windows. The following loop catches the
Expose event and draws text into each window using XDrawString on the initial
exposure of each window (xexpose.count == 0). In addition, when the Exit
button is pressed, the while loop exits and the application terminates.

   while( !Done ){
      XNextEvent(p_disp, &theEvent);
      if( theEvent.xany.window == Main){
         if( theEvent.type == Expose && theEvent.xexpose.count == 0){
            XDrawString(p_disp, Main, theGC, 1, 40, msgtext, strlen(msgtext));
         }
      }
      if( theEvent.xany.window == Exit){
         switch(theEvent.type){
            case Expose:
               if( theEvent.xexpose.count == 0){
                  XDrawString(p_disp, Exit, exitGC, 2, 11, extext,
                              strlen(extext) );
               }
               break;
            case ButtonPress:
               Done = 1;
         }
      }
   }

Step V : Clean Up and Close Display
At this point the application is over; the various handles must be freed, the
windows destroyed, and the display closed. The functions typically used for
this are demonstrated below:

   XFreeGC(p_disp, theGC);
   XFreeGC(p_disp, exitGC);
   XUnloadFont(p_disp, mfontstruct->fid);
   XDestroyWindow(p_disp, Main);
   XCloseDisplay(p_disp);
   exit(0);

Note that all of the functions, structures, and messages used above are
defined in '/usr/include/X11/Xlib.h', './X11/Xutil.h' and './X11/X.h'.

Inline Assembler With GCC
-------------------------
Due to the presence of the GAS assembler within GCC, inline assembler is
pretty straightforward. In GCC, the 'asm' keyword is used to prefix a block of
asm instructions; the format of 'asm' is as follows:

   asm( statements : output vars : input vars : modified registers);

Note that the last three parameters are usually used only if you are writing
an entire function in assembly language, or if you are modifying registers
that you do not save (it is better to save all the registers that you will
modify, if they contain values that will be needed later). The asm statements
are passed directly to GAS, and thus they need to be in a format that GAS will
recognize.
For this reason, multiline asm statements will require a newline (and,
optionally, a tab) after each statement, like so:

   asm( " statement1 \n statement2 \n statement3 \n statement4"
        : "=g" (outvar)
        : "g" (invar)
        : "eax", "ebx", "ecx" );

or, as I have used below:

   asm( "statement1 \n\t"
        "statement2 \n\t"
        "statement3 \n\t"
        "statement4 \n\t");

Other than that there are no real restrictions. Structures do not pass well
between C and GAS; if you need to reference specific structure variables from
inline assembly code, it is better to place those variables into temporary C
variables, which can then be accessed from the assembler block as normal. The
following demonstrates this:

   fid = mfontstruct->fid;
   asm( " push fid\n push mainGC\n push p_disp\n call XSetFont\n add $12, %esp");

More information on the GCC inline assembler can be found at:

   Avly's Programming Page (http://www.castle.net/~avly/djasm.html)
   CodeX Software (http://www.gameprog.com/codex/tut/att_asm.html)
   Brennan's DJGPP Resources (http://brennan.home.ml.org/djgpp/) [Currently Down]

The XHell Sample Program
------------------------
In order to be able to use the C header files for X-Windows, the following
program has been written in C for GCC, using C code for the data declarations
and assembler for the 'meat' of the program. In Part II of this article (next
issue) I will convert this program to the Xt model and implement it in NASM.

// xhell.c ============================================================
#include <string.h>      /* strlen */
#include <X11/Xlib.h>
#include <X11/Xutil.h>

/* ==================== Global Variable Declarations ===================== */
char *msgtext = "You are in XHell",
     *extext = "Exit XHell",
     *m_font = "fixed",
     *app_name = "xhello",
     *window_title = "XHell",
     *szWhite = "white",
     *szBlack = "black";
XFontStruct *mfontstruct;
Display *p_disp;
Window Main, Exit;
GC mainGC, exitGC;
XEvent theEvent;
Font fid;
Colormap cmap;
int Done = 0;
unsigned long pxBlack, pxWhite;
XSetWindowAttributes xswa;
XColor pixBlack, pixWhite;
XGCValues gcv;

/* ================ Start of Main Function ==================== */
main() {
/* ===== Connect to Display ===== */
   asm( "push $0\n\t"
        "call XOpenDisplay\n\t"
        "movl %eax, p_disp\n\t"
        "add $4, %esp\n\t");

/* ===== Setup Colors n' Fonts ===== */
   asm( "push m_font\n\t"
        "push p_disp\n\t"
        "call XLoadQueryFont\n\t"
        "add $8, %esp\n\t"
        "movl %eax, mfontstruct");

/* ===== Prepare Main Window ===== */
   fid = mfontstruct->fid;

/* ===== Create Main Graphics Context ===== */
   // Obtain Colormap Handle
   asm( "push p_disp\n\t"
        "call XDefaultScreen\n\t"
        "add $4, %esp\n\t"
        "push %eax\n\t"
        "push p_disp\n\t"
        "call XDefaultColormap\n\t"
        "add $8, %esp\n\t"
        "movl %eax, cmap");

   // Allocate White and Black Colors
   asm( "push $pixWhite\n\t"
        "push $pixWhite\n\t"
        "push szWhite\n\t"
        "push cmap\n\t"
        "push p_disp\n\t"
        "call XAllocNamedColor\n\t"
        "add $20, %esp");
   asm( "push $pixBlack\n\t"
        "push $pixBlack\n\t"
        "push szBlack\n\t"
        "push cmap\n\t"
        "push p_disp\n\t"
        "call XAllocNamedColor\n\t"
        "add $20, %esp");

   xswa.background_pixel = pixWhite.pixel;
   asm( "push $xswa\n\t"
        "movl $1, %ebx\n\t"
        "shl $1, %ebx\n\t"              //CWBackPixel = 1 << 1
        "push %ebx\n\t"
        "push $0\n\t"                   //CopyFromParent = 0 (X.h)
        "push $1\n\t"                   //InputOutput = 1 (X.h)
        "push $0\n\t"                   //CopyFromParent = 0 (X.h)
        "push $1\n\t"
        "push $50\n\t"
        "push $100\n\t"
        "push $100\n\t"
        "push $100\n\t"
        "push p_disp\n\t"
        "call XDefaultRootWindow\n\t"
        "add $4, %esp\n\t"
        "push %eax\n\t"
        "push p_disp\n\t"
        "call XCreateWindow\n\t"
        "add $48, %esp\n\t"
        "movl %eax, Main");

   gcv.font = fid;
   asm( "push $gcv\n\t"
        "movl $1, %ebx\n\t"
        "shl $14, %ebx\n\t"             //GCFont = 1 << 14
        "push %ebx\n\t"
        "push Main\n\t"
        "push p_disp\n\t"
        "call XCreateGC\n\t"
        "add $16, %esp\n\t"
        "movl %eax, mainGC");

   pxBlack = pixBlack.pixel;
   pxWhite = pixWhite.pixel;
   asm( "push fid\n\t"
        "push mainGC\n\t"
        "push p_disp\n\t"
        "call XSetFont\n\t"
        "push pxBlack\n\t"
        "push mainGC\n\t"
        "push p_disp\n\t"
        "call XSetForeground\n\t"
        "push pxWhite\n\t"
        "push mainGC\n\t"
        "push p_disp\n\t"
        "call XSetBackground\n\t"
        "add $36, %esp");

   asm( "movl $1, %ebx\n\t"
        "shl $15, %ebx\n\t"             //ExposureMask = 1 << 15
        "push %ebx\n\t"
        "push Main\n\t"
        "push p_disp\n\t"
        "call XSelectInput\n\t"
        "add $12, %esp");

   asm( "push Main\n\t"
        "push p_disp\n\t"
        "call XMapWindow\n\t"
        "add $8, %esp");

/* ===== Create Child Windows ===== */
   asm( "push pxWhite\n\t"
        "push pxBlack\n\t"
        "push $1\n\t"
        "push $15\n\t"
        "push $60\n\t"
        "push $1\n\t"
        "push $15\n\t"
        "push Main\n\t"
        "push p_disp\n\t"
        "call XCreateSimpleWindow\n\t"
        "movl %eax, Exit\n\t"
        "add $36, %esp");

   asm( "movl $1, %ebx\n\t"
        "shl $15, %ebx\n\t"             //ExposureMask = 1 << 15
        "movl $1, %ecx\n\t"
        "shl $2, %ecx\n\t"              //ButtonPressMask = 1 << 2
        "or %ecx, %ebx\n\t"
        "push %ebx\n\t"
        "push Exit\n\t"
        "push p_disp\n\t"
        "call XSelectInput\n\t"
        "add $12, %esp");

   gcv.foreground = pxBlack;
   gcv.background = pxWhite;
   asm( "push $gcv\n\t"
        "movl $1, %ebx\n\t"
        "shl $14, %ebx\n\t"             //GCFont = 1 << 14
        "movl $1, %ecx\n\t"
        "shl $2, %ecx\n\t"              //GCForeground = 1 << 2
        "or %ecx, %ebx\n\t"
        "movl $1, %ecx\n\t"
        "shl $3, %ecx\n\t"              //GCBackground = 1 << 3
        "or %ecx, %ebx\n\t"
        "push %ebx\n\t"
        "push Exit\n\t"
        "push p_disp\n\t"
        "call XCreateGC\n\t"
        "add $16, %esp\n\t"
        "movl %eax, exitGC");

   asm( "push Exit\n\t"
        "push p_disp\n\t"
        "call XMapWindow\n\t"
        "add $8, %esp");

/* ===== Event Loop ===== */
   while( !Done ){                      //Implemented in C to save space ;)
      XNextEvent(p_disp, &theEvent);
      if( theEvent.xany.window == Main){
         if( theEvent.type == Expose && theEvent.xexpose.count == 0){
            asm( "push $16\n\t"
                 "push msgtext\n\t"
                 "push $40\n\t"
                 "push $1\n\t"
                 "push mainGC\n\t"
                 "push Main\n\t"
                 "push p_disp\n\t"
                 "call XDrawString\n\t"
                 "add $28, %esp");
         }
      }
      if( theEvent.xany.window == Exit){
         switch(theEvent.type){
            case Expose:
               if( theEvent.xexpose.count == 0){
                  XDrawString(p_disp, Exit, exitGC, 2, 11, extext,
                              strlen(extext) );
               }
               break;
            case ButtonPress:
               Done = 1;
         }
      }
   }

/* ===== Close Display ===== */
   asm( "push mainGC\n\t"
        "push p_disp\n\t"
        "call XFreeGC\n\t"
        "add $8, %esp\n\t"
        "push exitGC\n\t"
        "push p_disp\n\t"
        "call XFreeGC\n\t"
        "add $8, %esp\n\t"
        "push fid\n\t"
        "push p_disp\n\t"
        "call XUnloadFont\n\t"
        "add $8, %esp\n\t"
        "push Main\n\t"
        "push p_disp\n\t"
        "call XDestroyWindow\n\t"
        "call XCloseDisplay\n\t"
        "add $8, %esp");
}
; EOF =================================================================

As you can see, producing an XLib program in assembly language is rather
unwieldy. The code produced is primarily data manipulations and C calls; there
is not a lot that assembly has to offer, even in the event loop. In fact, the
only real optimization --aside from overhead added by the compiler, which in
the above case we do not bypass-- is in the use of straight calls rather than
the macros my original C "hello world" relied on. While this in itself is
somewhat of a triumph --for by coding the C application in assembler you learn
exactly how much superfluous code there was to get rid of-- it is not enough.
In the next issue, I will cover Xt programming in assembler, which will use
widgets/resources rather than create windows from scratch, therefore placing
the bulk of the code in existing system libraries and therefore making the
resultant application much smaller.

::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::................................ASSEMBLY.LANGUAGE.SNIPPETS
IsASCII?
by Troy Benoist

;Summary:       Routine to test whether value in AH is ASCII or not
;               (0-127d = ASCII)
;Compatibility: All DOS versions
;Notes:         4 BYTES! Input: AH=value to check.
        cmp     ah,80h  ;80FC80 Compare value in AH to 128 and set flags.
        salc            ;D6     Set AL=FF if CF=1, or set AL=0 if CF=0.
;REGISTERS DESTROYED: AL   RETURNS: AL=0 if AH is not ASCII, FF if so.


ENUM
by mammon_

;Summary:   A NASM macro emulating the C 'enum' keyword
;Assembler: NASM
%macro ENUM 2-*         ;Usage: ENUM int SYMBOLS
%assign i %1            ;  where int is the number to begin enumeration at [0]
%rep %0-1               ;  SYMBOLS is a list of Symbols to define
  %2 EQU i              ;Example: ENUM 0 TRUE FALSE
  %assign i i+1         ;  this EQUates TRUE to 0 and FALSE to 1
  %rotate 1             ;Example: ENUM 11 JACK QUEEN KING
%endrep                 ;  this EQUs JACK to 11, QUEEN to 12, KING to 13
%endmacro


CallTable
by mammon_

;Summary:       Error Handler to demonstrate call-tables
;Compatibility:
;Notes:  The EQUs define offsets from the start of ErrorHandler. Thus,
;        ERROR_FILE_NOT_FOUND is at offset 0, ERROR_FILE_READ_ONLY is
;        at offset 4 ( one dword from offset 0), etc.
;        Each entry in the call table contains the address of the
;        code label listed there...so, in order, ErrorHandler contains
;        the addresses for the functions ERROR1, ERROR2, ERROR3, and
;        ERROR4.
;        The code to call an error handler uses as its base
;               call [ErrorHandler]
;        or, call the function whose address is stored at location
;        ErrorHandler. By adding the EQUs to this base, one gets the
;        offset for each function within ErrorHandler.

ERROR_FILE_NOT_FOUND    EQU 0
ERROR_FILE_READ_ONLY    EQU 4
ERROR_DISK_FULL         EQU 8
ERROR_UNKNOWN           EQU 12

ErrorHandler:
;------------ Here lies the Call-Table
        DWORD ERROR1
        DWORD ERROR2
        DWORD ERROR3
        DWORD ERROR4
;------------ Here ends the Call-Table

;Handlers for various errors; offsets to these are stored in the Call-Table
ERROR1:
        ...Code to Create File...
        ret
ERROR2:
        ...Code to CHMOD File...
        ret
ERROR3:
        ...Code to Display Disk Full Message...
        jmp     Exit_Program
ERROR4:
        ...Code to Display Unknown System Error-Code...
        jmp     Exit_Program

;Code to call Various errors
        call    dword ptr [ErrorHandler + ERROR_FILE_NOT_FOUND]
        call    dword ptr [ErrorHandler + ERROR_FILE_READ_ONLY]
        jmp     dword ptr [ErrorHandler + ERROR_DISK_FULL]
        jmp     dword ptr [ErrorHandler + ERROR_UNKNOWN]

::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::...........................................ISSUE.CHALLENGE
PE Program Displays Its Command Line
by Xbios2

The Challenge
-------------
Write the smallest possible PE program (win32) that outputs its command line.

The Solution
------------
This problem looks like the one about the 11-byte .COM program solved in the
previous issue. Yet the method used to solve it is entirely different. This is
because while .COM files include just raw code and data, the PE files include
a header with information on the file. It is this header that must be
'tweaked' to get a small file.

Before going on, some things must be cleared up:

1. This article relies _heavily_ on "The PE File Format" by B.Luevelsmeyer
   (whom I really thank). You are advised to find the .txt and read it. Of
   course Microsoft provides its own documentation but they would hardly
   ever say 'this seems to be ignored' for their own format.

2. If you think that PE (Portable Executable) is the format introduced by
   win95 you're wrong. Not only was PE created for winNT, but it also seems
   that win95 is not 100% PE compatible. Anyway, this article has been
   written for winNT, and I don't think anything will run in windows 95.

3. This article was based on a 'trial and error' method. Some solutions exist
   only because they work. So don't ask why... (Actually the trial and error
   resulted in two BSODs, thus proving that a program can crash windows NT
   without even running its own code)

4. No, I'm not paranoid. I just like pushing things to their limit :)

Now, on to the solution...
The code to print the command line looks like this:

----------------- normal.asm -----------------------
.386
.model flat

extrn GetCommandLineA:proc
extrn GetStdHandle:proc
extrn WriteFile:proc

.data?
dummy   db ?

.code
start:
        call    GetCommandLineA
        xor     ecx, ecx
        push    ecx
loop1:
        inc     ecx
        cmp     byte ptr [eax+ecx], 0
        jne     short loop1
        push    esp
        push    ecx
        push    eax
        push    -11
        call    GetStdHandle
        push    eax
        call    WriteFile
        ret
ends
end start
----------------------------------------------------

Some comments on the code:
- the .data? section is present because I can't make TASM work without any
  data
- there is no ExitProcess. In its place there is a simple 'ret'. This is
  because the entry point is actually called by kernel32 with the following
  piece of code:

        call    [ebp+8]         ; [ebp+8] holds the entry point address
        push    eax
        jmp     label
        ...
label:
        call    ExitThread

This program compiles under TASM to a file 4 KB long. Those 4096 bytes are
divided like this:

   Dos Stub             256
   PE Header            248
   4 section headers    160
   padding              872
   ------------------------
   code                  50
   padding              462
   imports              132
   padding              380
   reloc                 16
   padding             1520

This means that we have:

   16% header
    5% code / data
   79% padding

It seems that TASM can't create anything smaller. So, the code will have to be
written by hand in a hex editor. Actually you don't have to worry, as you'll
only have to write 192 bytes for the final program (believe it or not!).

In order to shrink the file, the following steps must be taken: Remove
Padding, Use a Single Section, Remove the DOS Stub, Tweak the PE Header,
Squeeze the Code, Squeeze the Imports, and 'ReAssemble' the Program.

1. Remove padding
-----------------
By changing the 'FileAlignment' field in the PE header, all the padding can be
discarded. (Actually it seems that win95 won't allow this)

2. Use one section
------------------
TASM creates the following sections:

   .code  : code
   .data  : initialized and uninitialized data
   .idata : imports
   .reloc : relocation info

-The .reloc section is not needed, as only DLLs get relocated
-The .data section is only present because I can't have TASM create a normal
 executable without a data section.
-The .idata section can then be merged with the .code section.

Remember that the name of each section does not depend on what the section
contains, since the OS finds things like imports, relocations or resources
from the directory in the PE header.

3. No DOS stub
--------------
All compilers that compile PE executables create a DOS stub that displays a
message like 'This program must be run under Win32'. Yet this is NOT required
by the PE format. What PE needs (as seen in [ntdll.dll]RtlImageNtHeader or
[imagehlp.dll]ImageNtHeader) is:

PIECE I: DOS HEADER
---------------------------------------------
0000| 4D5A **** **** **** **** **** **** ****
0010| **** **** **** **** **** **** **** ****
0020| **** **** **** **** **** **** **** ****
0030| **** **** **** **** **** **** ???? ????

where ???? is the offset of the PE header from the beginning of the file

4. Tweaked PE header
--------------------
The PE header consists of the following structures:

IMAGE_NT_SIGNATURE: 00004550h

IMAGE_FILE_HEADER:
  WORD  Machine                      ; >> 014Ch for Intel 386
  WORD  NumberOfSections             ; 1 for this example
  DWORD TimeDateStamp                ; *
  DWORD PointerToSymbolTable         ; *
  DWORD NumberOfSymbols              ; *
  WORD  SizeOfOptionalHeader         ; >> 70h (Opt. header + directories)
  WORD  Characteristics              ; >> 0102h for 32bit executable

IMAGE_OPTIONAL_HEADER:
  WORD  Magic                        ; 010Bh
  BYTE  MajorLinkerVersion           ; *
  BYTE  MinorLinkerVersion           ; *
  DWORD SizeOfCode                   ; *
  DWORD SizeOfInitializedData        ; *
  DWORD SizeOfUninitializedData      ; *
  DWORD AddressOfEntryPoint          ; >> ???? RVA of entry point
  DWORD BaseOfCode                   ; *
  DWORD BaseOfData                   ; *
  DWORD ImageBase                    ; >> 00100000h for this example
  DWORD SectionAlignment             ; 2
  DWORD FileAlignment                ; 2
  WORD  MajorOperatingSystemVersion  ; *
  WORD  MinorOperatingSystemVersion  ; *
  WORD  MajorImageVersion            ; *
  WORD  MinorImageVersion            ; *
  WORD  MajorSubsystemVersion        ; >> 0004
  WORD  MinorSubsystemVersion        ; >> 0000
  DWORD Win32VersionValue            ; *
  DWORD SizeOfImage                  ; >> ????
  DWORD SizeOfHeaders                ; *
  DWORD CheckSum                     ; *
  WORD  Subsystem                    ; 0003 for win32 console application
  WORD  DllCharacteristics           ; *
  DWORD SizeOfStackReserve           ; 00100000h
  DWORD SizeOfStackCommit            ; 00001000h
  DWORD SizeOfHeapReserve            ; 00100000h
  DWORD SizeOfHeapCommit             ; 00001000h
  DWORD LoaderFlags                  ; *
  DWORD NumberOfRvaAndSizes          ; 2 data directories (Exports & Imports)

...a number (actually 2) of the following:
IMAGE_DATA_DIRECTORY:
  DWORD VirtualAddress               ; 0 for exports, ???? for imports
  DWORD Size                         ; 0 for exports, ???? for imports

...a number (actually 1) of the following:
IMAGE_SECTION_HEADER:
  BYTE  Name[8]                      ; * (Anything we like)
  DWORD VirtualSize                  ; ?! (h.o. word must be zero??)
  DWORD VirtualAddress               ; >> ????
  DWORD SizeOfRawData                ; >> ????
  DWORD PointerToRawData             ; >> ????
  DWORD PointerToRelocations         ; *
  DWORD PointerToLinenumbers         ; *
  WORD  NumberOfRelocations          ; *
  WORD  NumberOfLinenumbers          ; *
  DWORD Characteristics              ; *

So the raw hex data for the PE header are:

PIECE II: PE HEADER
---------------------------------------------
    | 5045 0000 4C01 0100 **** **** **** ****
    | **** **** 7000 0201 0B01 **** **** ****
    | **** **** **** **** ???? ???? **** ****
    | **** **** 0000 1000 0200 0000 0200 0000
    | **** **** **** **** 0400 0000 **** ****
    | ???? ???? **** **** **** **** 0300 ****
    | 0000 1000 0010 0000 0000 1000 0010 0000
    | **** **** 0200 0000 0000 0000 0000 0000
    | ???? ???? ???? ???? **** **** **** ****
    | **** **** ???? ???? ???? ???? ???? ????
    | **** **** **** **** **** **** **** ****

NOTES:
- ???? means that the value is needed but has to be filled in later as it
  depends on the code
- **** means that the value is either completely ignored or it can be set to
  any value without raising an error
- the main difference between this and a 'normal' PE header is that the size
  of the optional header is 70h (112 bytes) instead of the standard 0E0h (224
  bytes). This is because there are only 2 directories instead of 16. This
  seems to be the minimum number of directories possible, as there seems to
  be no way of running an .exe that has no imports.

5. Squeezed code
----------------
Even though the code we have is already tight, it has one major drawback: It
invokes three API functions. To realize what this means just think that the
names of the functions are included in the imports section as normal ASCII
which means that the names alone would take 36 bytes...

The solution here (since those functions are needed) is to call the functions
directly. This is possible because kernel32.dll is never relocated so the
function entry points are always the same (for a given version of windows).
For NT4 those values are:

   GetStdHandle: 77F01CBB
   WriteFile   : 77F0D354

GetCommandLine is a special case since it has the format:

   GetCommandLineA proc near
           mov     eax, [77F4657Ch]
           retn
   GetCommandLineA endp

so the final code will look like:

----------------- code.hex -----------------------
A17C65F477      mov     eax, [CommandLine]
BEBB1CF077      mov     esi, offset GetStdHandle
33C9            xor     ecx, ecx
51              push    ecx
41              inc     ecx
803C0800        cmp     byte ptr [eax+ecx], 0
75F9            jnz     -07
54              push    esp
51              push    ecx
50              push    eax
6AF5            push    -11             ; StdOut
FFD6            call    esi             ; GetStdHandle
50              push    eax
B854D3F077      mov     eax, offset WriteFile
FFD0            call    eax
C3              ret
--------------------------------------------------

6. Squeezed imports
-------------------
[Comment: read a text on PE format to better understand what's going on]

As mentioned earlier, the PE file must have an imports directory in order to
load properly.
Yet, since we call API functions directly, we only have to specify one dummy
import. A good choice (since it really has a short name) is 'Arc' from
'gdi32.dll'. To specify this imported function we would need:

IMAGE_IMPORT_DESCRIPTOR for gdi32.dll:
  OriginalFirstThunk    ; *
  TimeDateStamp         ; *
  ForwarderChain        ; *
  Name                  ; >> ???? RVA of ASCII string 'gdi32.dll',0
  FirstThunk            ; >> ???? RVA described later...

IMAGE_IMPORT_DESCRIPTOR full of zeroes to specify end of imports
  OriginalFirstThunk    ; *
  TimeDateStamp         ; *
  ForwarderChain        ; *
  Name                  ; 0 This is checked to see if it is the end...
  FirstThunk            ; *

'FirstThunk' is the RVA of a 0-terminated list of RVAs, one for each function
in the specified DLL. For this example we only need one RVA followed by a null
dword. This RVA will point to a structure

IMAGE_IMPORT_BY_NAME:
  WORD Hint             ; *
  BYTE Name[...]        ; 'Arc',0

By putting all this together we would have:

PIECE III: IMPORTS
---------------------------------------------
    | **** **** **** **** **** **** -dword 1-
    | -dword 2- -dword 3- 0000 0000 **** ****
    | 0000 0000 **** ****

dwords 1 and 2 are the two RVAs for the IMAGE_IMPORT_DESCRIPTOR. dword 3 is
the RVA to the IMAGE_IMPORT_BY_NAME. So, dword 2 is the RVA of dword 3. We
also need space for the two strings 'gdi32.dll',0 and 'Arc',0.

There is a way to use even fewer bytes for the imports. Just remember that the
imports are examined after the file has been mapped into memory. So, since
memory is allocated in blocks, after the end of the file there will be a space
full of zeroes. So by placing the three dwords in the last 12 bytes of the
file, there is no need for the two zeroes.

7. 'Assemble' the program
-------------------------
The values marked as ???? will be:

   Offset of PE header     : 00000010
   AddressOfEntryPoint     : 00000002
   SizeOfImage             : 000000C0
   Imports RVA             : 000000A8
   Imports Size            : 00000028
   Section VirtualAddress  : 00000000
   Section SizeOfRawData   : 000000C0
   Section PointerToRawData: 00000000
   Dll Name RVA            : 00000098
   Dll FirstThunk RVA      : 000000BC
   Dll Function Hint/Name  : 000000AE

Notice that the Section data and the Header (DOS and PE) are the same thing.
The section RVA is 0, so file offset and RVAs are the same. The code will be
broken in three pieces, connected by two jumps. The final result will be:

THE PROGRAM
---------------------------------------------
0000| 4D5A A17C 65F4 77BE BB1C F077 33C9 EB08
0010| 5045 0000 4C01 0100 5141 803C 0800 75F9
0020| 5451 EB06 7000 0201 0B01 506A F5FF D650
0030| B854 D3F0 77FF D0C3 0200 0000 1000 0000
0040|           0000 1000 0200 0000 0200 0000
0050|                     0400 0000
0060| C000 0000                     0300
0070| 0000 1000 0010 0000 0000 1000 0010 0000
0080|           0200 0000 0000 0000 0000 0000
0090| A800 0000 2800 0000 6764 6933 322E 646C
00A0| 6C00 0000 0000 0000 C000 0000 0000 0000
00B0| 4172 6300 9800 0000 BC00 0000 AE00 0000

Blank bytes are meaningless, and can be set to any value.

Wrapping Up
-----------
Well, if you managed to read up to here, and understood what happened, I guess
you need no more explanations. I just gave an idea (actually MANY ideas).
Maybe in another article I will start exploring the possibilities this
'experiment' showed me...

Next Issue Challenge
--------------------
Write a routine for converting ASCII hex to binary in 6 bytes.

::/ \::::::.
:/___\:::::::.
/| \::::::::.
:| _/\:::::::::.
:| _|\ \::::::::::.
:::\_____\:::::::::::.......................................................FIN